DESIGN AND ANALYSIS OF EXPERIMENTS FOR 
MODEL DISCRIMINATION IN 
UNIRESPONSE AND MULTIRESPONSE SYSTEMS 


A Thesis Submitted 

In Partial Fulfilment of the Requirements 
for the Degree of 

DOCTOR OF PHILOSOPHY 


by 

SANTOKH SINGH 


to the 

DEPARTMENT OF MATHEMATICS

INDIAN INSTITUTE OF TECHNOLOGY KANPUR 

AUGUST, 1986 



CERTIFICATE 


Certified that this work, entitled "DESIGN AND
ANALYSIS OF EXPERIMENTS FOR MODEL DISCRIMINATION IN
UNIRESPONSE AND MULTIRESPONSE SYSTEMS" by Santokh Singh,
has been carried out under our supervision and has not 
been submitted elsewhere for a degree. 




J. D. Borwankar
Professor 
Department of Mathematics 
Indian Institute of Technology 
Kanpur, India 




Musti S. Rao 
Professor 

Department of Chemical Engineering 
Indian Institute of Technology 
Kanpur, India 


August 17, 1986.






CERTIFICATE 


This is to certify that Mr. Santokh Singh has satisfactorily
completed the course requirement for the Ph.D. program in
Statistics. The courses credited by him include

M601 Graduate Mathematics I 

(i) Complex Analysis 

(ii) Probability and Statistics 

M603 Graduate Mathematics III

(i) Functional Analysis 

(ii) Topology 

(iii) Measure Theory 

(iv) Linear Algebra 

M741 Advanced Estimation Theory

M542 Regression Analysis

M841 Topics in Statistics: Admissibility of Estimators

M592 Numerical Analysis

M502 Computer Programming.

Mr. Santokh Singh was admitted to the candidacy of the Ph.D. 
degree in August 1983, after he successfully completed the
written and oral qualifying examinations. 

Head                               Convenor
Department of Mathematics          Departmental Post-Graduate Committee
Indian Institute of Technology
Kanpur.



ACKNOWLEDGEMENTS 


I wish to acknowledge my indebtedness to Professor J.D.
Borwankar of the Department of Mathematics and Professor M.S. 
Rao of the Department of Chemical Engineering at IIT Kanpur 
for their useful criticism and suggestions during the course 
of this work. 

Acknowledgements are also due to other people who helped 
me directly or indirectly in the development of this work. I 
owe a great debt to my parents for their help and inspiration 
throughout my academic career. My wife Kuldip and son Vikram 
have been very patient throughout my Ph.D. program. I gratefully
acknowledge their cooperation. It is a pleasure to put on
record my thanks to Professor J.L. Batra of the IME Program, who
has been a constant source of encouragement. Besides, a number
of my friends have been of considerable assistance; Paramjit Gj 
is to be mentioned in particular.

I would also like to express my gratitude to Prof. J.B.Shi
(former Head) and Prof. R.K. Jain (Head) of the Department of 
Mathematics for their help and cooperation. Thanks are also due
to other members of the department for the fruitful discussions I
had with them on several occasions. The excellent job
Mr. G.L. Misra has done in typing the manuscript deserves a lot of
appreciation. I am also thankful to Mr. A.N. Upadhyaya for doing
a neat job of cyclostyling.

Finally, I am greatly indebted to the Director, IIT Kanpur,
for allowing me registration in the Ph.D. program.

Santokh Singh


TABLE OF CONTENTS 


LIST OF TABLES
LIST OF FIGURES
SOME NOTATIONS AND SYMBOLS
SYNOPSIS

Chapter

1   INTRODUCTION AND REVIEW OF THE PREVIOUS WORK
    1.1 Systems and Models
    1.2 The Modelling Process
    1.3 Mathematization of a Process
    1.4 Review of the Previous Work in Model Discrimination
        1.4.1 Review of the Previous Work: Uniresponse Case
        1.4.2 Review of the Previous Work: Multiresponse Case

2   METHODOLOGY FOR MODEL DISCRIMINATION: DESIGN AND ANALYSIS
    2.1 An Approach to the Problem of Model Discrimination
    2.2 Search for an Appropriate Function
        2.2.1 Metric Properties of the Function, K
        2.2.2 Utilizing the Distance Function, K, in Model Discrimination
    2.3 Model Discrimination: Nonsequential
    2.4 Model Discrimination: Sequential
        2.4.1 Discrimination Criterion: An Index
        2.4.2 Design Criterion: A Weighted Function
    2.5 Development of the Discrimination Criterion
        2.5.1 Some Assumptions
        2.5.2 Evaluating the Akinness Between Two Normal Populations
        2.5.3 An Important Observation
        2.5.4 Distance for Assessment of the Appropriateness of a Model
    2.6 Termination Criterion
    2.7 Testing the Significance of the Distance Between Two Models
    2.8 A Model Adequacy Criterion

3   DESIGN OF EXPERIMENTS FOR MODEL DISCRIMINATION IN UNIRESPONSE SYSTEMS
    3.1 Basic Assumptions
    3.2 Case 1: Known, Homogeneous Variances of Errors
    3.3 Case 2: Unknown, Homogeneous Variances of Errors
    3.4 Case 3: Unknown, Heterogeneous Variances of Errors

4   DESIGN OF EXPERIMENTS FOR MODEL DISCRIMINATION IN MULTIRESPONSE SYSTEMS
    4.1 Basic Assumptions
    4.2 Case 1: Known, Equal Covariance Matrices of Errors
    4.3 Case 2: Unknown, Equal Covariance Matrices of Errors
    4.4 Case 3: Unknown, Unequal Covariance Matrices of Errors

5   SOME THEORY OF ESTIMATION
    5.1 Estimation of Model Parameters
    5.2 Estimation of Variance(s)/Covariances of Error(s)
    5.3 Estimation of Z u and Under Different Types of Assumptions

6   APPLICATION EXAMPLES
    6.1 The Scheme for Implementation of the Discrimination Procedure
        6.1.1 Simulation of Data
        6.1.2 Estimation of Model Parameters
        6.1.3 Discriminating Among Models and Design of Additional Experiments
        6.1.4 Optimization of Criterion Functions
    6.2 Discrimination Among Univariate Models
        6.2.1 Linear Models: Known, Equal Error Variances
        6.2.2 Nonlinear Models: Unknown, Homogeneous Error Variances
        6.2.3 Nonlinear Models: Unknown, Homogeneous Error Variances
    6.3 Discrimination Among Multivariate Models
        6.3.1 Nonlinear Models: Known, Equal Error Covariance Matrices
        6.3.2 Nonlinear Models: Unknown, Unequal Covariance Matrices of Errors

APPENDIX

A   COMBINING TWO QUADRATIC FORMS
B   SOME RESULTS OF MATRIX ALGEBRA
C   JOINT CUMULANT GENERATING FUNCTION
    C.1 Univariate Case
    C.2 Multivariate Case
D   SOME USEFUL RESULTS AND FORMULAE

REFERENCES



LIST OF TABLES 


TABLE

3.3.1  Values of the coefficients, a ip* ^jq
4.3.1  Values of the coefficients, a iks* PjJfcfc
6.2.1  Estimates of model parameters at different stages
6.2.2  Sequential discrimination among polynomial models: the present
       and the Box-Hill procedures (True model: M(3))
6.2.3  Weights for the criterion function 0 at different stages;
       discrimination among polynomial models
6.2.4  Sequential discrimination among nonlinear models: the present,
       the Box-Hill, and the Buzzi et al. procedures (True model: M(5))
6.2.5  Weights for the criterion function 0 at different stages;
       discrimination among nonlinear univariate models
6.2.6  Sequential discrimination among nonlinear univariate models:
       the present and the Box-Hill procedures (True model: M(2))
6.3.1  Sequential discrimination among nonlinear bivariate models:
       the present procedure (True model: M(1))
6.3.2  Weights used in the criterion function 0 at different stages;
       discrimination among nonlinear bivariate models
6.3.3  Sequential discrimination among nonlinear bivariate models:
       the procedure of Buzzi et al. (True model: M(1))
6.3.4  Discrimination index and posterior probability in discriminating
       among nonlinear bivariate models



LIST OF FIGURES 


FIGURE

6.1.1  Scheme for illustration of sequential discrimination
6.2.1  Status of models as determined by discrimination index;
       linear univariate models. True model: Model 3
6.2.2  Change of weights w_{u,v} from first to second stage;
       linear univariate models
6.2.3  Progress in discrimination through the proposed procedure;
       nonlinear univariate models. True model: Model 5
6.2.4  Change in weights w_{u,v} from first to second stage;
       nonlinear univariate models
6.2.5  Performance of discrimination index in identifying the
       true model; Model 2
6.2.6  Change in the posterior probabilities at different stages.
       True model: Model 2
6.3.1  Performance of the proposed procedure in discriminating among
       nonlinear bivariate models. True model: Model 1
6.3.2  Change in weights w_{u,v} from first to second stage;
       [...] models



SOME NOTATIONS AND SYMBOLS 


Notations 


ABS      absolute value
E        expectation operator
exp      exponential
E^(u)    the expectation under model u
sup      supremum
tr(A)    trace of the matrix A
t_r      r-variate t-distribution
Λ        criterion proposed in literature
0        the criterion proposed in this work


Symbols 


(u) 


* 


<rv/ 


n 

ir 

n 

|A| 

OK 

z. 

Si 

0 


(as superscript); under model u 

(on top) maximum likelihood estimator (m.l.e.)

(on top) some estimator 

(on top) some estimator also used for some constants 

(on top) some estimator 

(at bottom) a vector 

(at bottom) a two dimension array 

a statistical population 

product 

a process 

determinant A 

proportional 

the summation 

log likelihood 

estimation criterion function 

transpose of A


SYNOPSIS 


The significance of model building in a system study 
approach is well recognised. The requirement is often a 
mechanistic model, i.e., a set of mathematical equations 
which can describe the physical mechanism of the underlying 
process reasonably well. So far as arriving at the final
form of an appropriate model is concerned, the whole exercise
may not be as simple as one expects, for it may not simply
be the estimation of the unknowns appearing in the equation(s)
of the given model. The investigator frequently comes
across situations where several models have been postulated 
for the same system. This asks for discrimination among the 
competing models and selection of the best, before one proceeds 
to the task of model fitting. Further, it may be noted that 
the occurrence of situations which involve more than one 
response is not uncommon. Besides, the models nonlinear in 
parameters cannot be debarred from competition. 

For the present study we identify our problem as model
discrimination in uniresponse and multiresponse systems. Both
linear and nonlinear models fall under the purview of this work.
The task of model discrimination is normally accomplished
sequentially. A procedure for this purpose, therefore, consists
of the application of a discrimination criterion and, if required,
the use of a design criterion. Of the several procedures
proposed in the literature to that end, some have limitations
while a few others suffer from certain drawbacks. For example,
the methods proposed for bimodel problems cannot be applied
to multimodel situations; the methods devised for discrimination
among linear models may not work for nonlinear models; and
a procedure meant for a uniresponse system cannot handle the
problem if the system happens to be of multiresponse nature.
Furthermore, the drawbacks in some of the available procedures
make one hesitant in using them. For example, the decision
taken through the procedures which use posterior probability
as a model adequacy criterion may not be reliable unless the
observations from a large number of designed experiments
have been utilized. Similarly, the methods which neglect the
covariance structure of prediction errors may not do full
justice to all the models, while the demanding nature of
certain procedures sometimes makes their use restrictive.
In Chapter 1, where we introduce the problem, the limitations and
drawbacks of some procedures proposed in the literature are
discussed in detail.

In the present study we attempt to resolve some of the
difficulties which an investigator is likely to face in actual
practice. Now, it may be borne in mind that the basic step
in building a mechanistic model for a system is to establish
the true nature of the underlying process. To that purpose it
is important to appreciate that randomness is somehow imbued
into the process through one source or another, thus
rendering it probabilistic in nature. The observations arising
from a system as a result of certain inputs can, therefore, be
presumed to be realizations of some random variable(s), Y, which
may be characterized through a model in terms of the physical
parameters of the process and the input variables. We, therefore,
assume probability distributions characterizing the populations
supposed to have been generated by the rival models. In a
situation where several models have been proposed there would be
a distribution due to the true (hypothetical) model on the one
side and a host of alternative distributions under the proposed
models, $df^{(u)}$, u = 1, 2, ..., m (say), on the other. Based on
these distributions we develop an index, called the
Discrimination index, through which the claim
that the proposed model, u, is the best alternative to the true
model can be verified. This index, in fact, decides the relative
adequacy of a model. If the available set of observations does
not possess sufficient discriminatory power, the above index may
not show enough evidence in favour of a single model. In that
case a few more observations acquired through experiments specific
to the purpose of discrimination may prove to be helpful. This
necessitates the use of a design criterion. The criterion that
we propose in this work is based on the alternative probability
distributions of the random variable(s) on which a
discriminatory observation is yet to be realized. This is, in
fact, a weighted function involving all possible pairs of the
rival models. The weights employed in this function are
formulated to meet the basic requirement that a pair consisting
of distinct models receives less importance and the one with
close rivals is allowed to play a greater role in designing
an experiment. It may be noted that the variance-covariance
structure of errors has been given due consideration in the
formulation of both the design and discrimination criteria.
It has been argued that the maximization of the proposed design
criterion produces the optimal setting of the experiment. All this
has been done in Chapter 2. In the same chapter we introduce
a statistic for testing certain hypotheses relevant to the
problem.

In Chapter 3, we subject the general design criterion to
different sets of assumptions when the system is uniresponse.
The criteria thus obtained are applicable to linear as well as to
nonlinear models. Chapter 4 deals with multivariate linear
and nonlinear models. Once again the design criterion is exposed
to the multivariate analogues of the same sets of assumptions.
The development of the different forms of the basic design
criterion is based on the Bayesian approach. The proofs of
some results used in these two chapters for arriving at the
final forms of the criteria have been given in the Appendices.
Before the formulae developed in Chapters 3 and 4 can be applied
to a specific problem one needs estimates of certain unknowns,
such as parameters of the models and the variance-covariance
structure of errors. Some theory of estimation to that effect
has been given in Chapter 5, from where a suitable estimation
criterion or an appropriate formula can be chosen according to
the assumptions applicable in a given situation. In the last
chapter, i.e., Chapter 6, we demonstrate the implementation
of our procedure, comprising design and analysis for model
discrimination. This is done through Monte Carlo simulation.
The results thus obtained are compared with those reported
elsewhere. Through application to varied problems, the
proposed procedure has been found to converge faster to the
true model. Besides, the emphasis is always on the best model
followed by less apt ones rather than on bad models. The
discrimination thus achieved looks pretty sharp. The index
which is employed for assessing the discrimination shows a
consistent trend and does not oscillate as the posterior
probability does. The proposed weights, too, have been seen
to be appropriate.



CHAPTER 1 


INTRODUCTION AND REVIEW OF THE PREVIOUS WORK 


1.1 SYSTEMS AND MODELS 

In the present day world 'system' is perhaps one of the
most widely used concepts in scientific investigations. The
problems of mechanics are solved conveniently after identifying
a mechanical system; the biologist thinks of an organism as a
biological system; the chemical engineer studies a reaction
through a chemical process control system; an economist,
studying the economy of a state, has in his mind an economic
system; and so on. The system study seems to be the
way of investigation in almost all the disciplines. In whichever
field one may be using this concept, the term system has a
common underlying meaning. In fact, one can think of a system
as a unit consisting of interdependent elements or components
which interact regularly so as to complete an implicitly or
explicitly assigned job. This is the world inside the system.
But, there is a world outside, too, which exercises its influence
through the input. Whenever the system receives an input the
process operating in the system transforms it into an output
through interaction of its elements. If the underlying system
is deterministic, a particular input will always result in
the same output. On the other hand, a probabilistic system
may respond with any one from amongst a range or distribution
of outputs. It can, however, be argued that observations or
measurements on the output of a system are random rather than
deterministic. In certain cases randomness is inherent in the
system, while in others only certain manifestations are observed
because of insufficient information about the system response(s)
or a lack of techniques to observe the output. Often, the
observer is just negligent or careless. In general, systems can,
therefore, be treated stochastically.

Unfortunately, the behaviour of a system is not always
exemplary: a biological system may be disturbed by a disease;
a chemical system is subject to catalyst decay; an ecological
system is likely to be upset by pollution; and an economic
system may be interrupted by inflation; etc. One must, therefore,
attempt to save a system from external disturbances so as to
improve the quality of its performance. There are several ways
to do this. One can build a new system and discard the old one.
But, when such a replacement is not possible, as is generally
the case, the only alternative is to live with the existing
system. Efforts can, however, be made to provide it with
proper care and exercise supervision. Whatsoever be the case,
the original problem always arises in the real world; sometimes
in the controlled conditions of a laboratory and sometimes in
the much less understood environment of everyday life. In any
case, a system study, aimed at an efficient control of the system,
is always concerned with the process (a series of events) which
generates one or more responses as a result of certain input.

The probable future course of events must be predicted in
order to evolve ways for imposing forces on the pertinent
system so that it may be moved in a direction deemed to be
desirable. The use of the actual process for this purpose may
not be feasible in certain situations and not desirable in
others for economic reasons. One may, therefore, resort to
modelling. In ordinary language the word 'model' has many
meanings. But the model which we use in this study is composed
of algebraic equations rather than being a miniature of the
system or something else. Such a model would, no doubt, be a
mere mathematical representation of the underlying mechanism
of the process, but could still serve the purpose with reasonable
faithfulness. In fact, this provides us with the most powerful
analytical tool for studying and investigating a system. The
objectives, such as control of the process and prediction or
optimization of the output, can be conveniently achieved through
a mathematical model. It is for these reasons that almost all
the disciplines which adopt a 'system approach' also use the
concept of model as a basis for developing solutions to various
relevant problems.

1.2 THE MODELLING PROCESS 

A model may be a mechanistic representation of the system
or just an empirical relation, depending on the purpose of an
investigation. In a given situation, if the aim is to predict
or optimize, an empirical model might be entirely satisfactory.
On the other hand, if the primary interest is not merely to
predict the response(s) over a limited region, but rather to
elucidate the mechanism or to obtain meaningful results in
regions relatively unknown, then a mechanistic representation of
the process may be more appropriate. In fact, a mechanistic
model can provide useful information where the optimization
of the design of a system leans heavily on what is actually
taking place. But, unfortunately, the model, even if it is a
mechanistic one, is seldom exact. One must, therefore, satisfy
oneself with a close approximation so long as it is capable
of simulating the mechanism and, thus, represents the real
world phenomenon to a reasonable extent.

The utility of a mathematical model to scientists
and engineers is immense, in that it is a useful tool for
compressing a large amount of information, providing new insight
into the process, and suggesting ways for the future development
of the system. The process of modelling, therefore, is an
important operation and ought to be carried out in stages,
crossing each stage carefully. While one has to be very
particular in formulating the requirements of a model, one must
also be fussy in hypothesizing its form. If there happens to
be a single model considered appropriate for describing the
process of the system, the only job of the investigator is to
secure precise estimates of the parameters, i.e., the unknowns
appearing in the model equation(s). The problem is one of
estimation. But, things may not always be that simple.
One might find oneself at a point where "all roads seem to
lead to Rome". That is to say, there may be several models
capable of explaining the mechanism of the system. In such
situations, a model can be actually put to use only if a
single model is picked out of the given lot and fitted
precisely through the available data. The problem, now, is one of
discrimination among the equally plausible models. The
investigator must take a decision on the right path to be
followed. Once the most appropriate model has been selected
from amongst the given lot, one may start refining the
estimates of its parameters. Discrimination may either be
possible with whatever data are available at hand in the
beginning or one may require additional observations. In the
latter case, it is naturally desirable to have such observations
as can help in a fast and sharp discrimination. This needs
designing additional discriminatory settings of the input
variable(s). Thus one starts with a discrimination criterion
but may require a discriminatory design criterion if the given
set of observations does not prove to be strong enough to pick
the best model.



6 


1.3 MATHEMATIZATION OF A PROCESS

Quite often, the experimenter deals with situations
where only one characteristic of the output is being observed,
either because the output is unidimensional or because the interest
of the experimenter centers around one aspect of the system
only. For example, in an experiment, one might be interested
only in the tar content of a gas stream as a result of input
of gas inlet temperature and rotor speed. But, situations are
not uncommon when observations from the process come as
measurements on more than one characteristic of the system.
For example, in a study of a chemical process, with each input,
one might observe yield and density of the product.

Consider a system whose mechanism is made up of p
parameters, $\theta_1, \theta_2, \ldots, \theta_p$, and suppose that it responds in terms
of the r dependent variables, $Y_1, Y_2, \ldots, Y_r$, as a result of
certain input of the q independent variables, $\xi_1, \xi_2, \ldots, \xi_q$.
Mathematically, such a system can be represented by a set of r
equations

$$E(Y_i) = \eta_i(\xi_1, \xi_2, \ldots, \xi_q;\ \theta_1, \theta_2, \ldots, \theta_p), \quad i = 1, 2, \ldots, r, \qquad (1.3.1)$$

where r = 1 or r ≥ 2, according as the system is uniresponse
or multiresponse, and the function $\eta_i$ can be linear or nonlinear
in the parameters $\theta_1, \theta_2, \ldots, \theta_p$. In fact, (1.3.1) comprises the
true mathematical model of the given system, but is no more
than a mere hypothesis. One, therefore, seeks a close
mathematical representation. Specific to this study, we
consider a situation in which not one but a number of physically
meaningful models have been postulated. To start with, these
models are thus assumed to describe the process equally well.
The given system can, therefore, be alternatively represented
by any one of the m (say) rival models:

$$E^{(u)}(Y_i) = \eta_i^{(u)}(\xi_1, \xi_2, \ldots, \xi_q;\ \theta_1^{(u)}, \theta_2^{(u)}, \ldots, \theta_{p_u}^{(u)}), \quad i = 1, 2, \ldots, r;\ u = 1, 2, \ldots, m. \qquad (1.3.2)$$

The modelling process in this case consists of, first, identifying
'the' model and then proceeding to improve upon the estimates of
its parameters, if required.
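As a rough illustration of (1.3.2), the sketch below represents two rival models of a uniresponse system (r = 1, q = 1) as ordinary functions and simulates noisy observations from one of them. The functional forms, parameter values, and the names `eta_1`, `eta_2`, `rival_models`, and `simulate` are assumptions made purely for illustration; none of them is taken from the thesis.

```python
import numpy as np

# Hypothetical rival models eta^(u)(xi; theta^(u)) for one response and one input.
def eta_1(xi, theta):
    """Model 1 (assumed form): straight line, linear in theta."""
    return theta[0] + theta[1] * xi

def eta_2(xi, theta):
    """Model 2 (assumed form): exponential rise, nonlinear in theta."""
    return theta[0] * (1.0 - np.exp(-theta[1] * xi))

rival_models = {1: eta_1, 2: eta_2}

def simulate(eta, theta, xi, sigma, rng):
    """Draw Y = eta(xi, theta) + error, with error ~ N(0, sigma^2)."""
    xi = np.asarray(xi, dtype=float)
    return eta(xi, theta) + rng.normal(0.0, sigma, size=xi.shape)

rng = np.random.default_rng(0)
xi = np.linspace(0.0, 2.0, 5)                      # settings of the input variable
y = simulate(eta_2, (3.0, 1.5), xi, sigma=0.1, rng=rng)

# Expected responses E^(u)(Y) under each rival model at the same settings.
expected = {u: eta(xi, (3.0, 1.5)) for u, eta in rival_models.items()}
```

The point of the sketch is only that every rival model maps the same input settings to its own expected response, while the observations themselves carry random error.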

1.4 REVIEW OF THE PREVIOUS WORK IN MODEL DISCRIMINATION 

Which of the m models best represents the system? This
has been one of the fundamental questions confronting researchers
engaged in building models for various types of systems.
Discrimination among the given models may provide an answer.
In fact, this problem of selection of the most appropriate model
from amongst the proposed ones has been widely discussed in the
literature. Reilly and Blau (1974), Hill (1978), Singh and
Rao (1981), Iyengar and Rao (1983, 1984), and several others
have given extensive reviews of model discrimination procedures.
The entire work in this field can be put into two broad categories,
namely, the Bayesian and the Non-Bayesian. However, the various
procedures proposed in the literature for discrimination among
rival models have one thing in common: almost all of them use
the concept of divergence in one way or another.

1.4.1 Review of the Previous Work: Uniresponse Case


Bimodel Case: Hunter and Reiner (1965) are, perhaps, the
pioneers in demonstrating the use of 'divergence' in designing
experiments for model discrimination. In order to discriminate
between two regression models they assume that the observational
errors are normally distributed about zero mean with a constant
variance, $\sigma^2$. According to them, an experimental run, the
(n+1)th (say), results in a discriminatory observation if it is
conducted at a setting of the input variable(s), $\xi_{n+1}$, where the
two fitted surfaces are farthest apart. But, in the process
of optimization this means an immense effort involved in
computing estimates of the model parameters (especially, in
case of nonlinear models) for every choice of $\xi_{n+1}$ in the
operability region. Hunter and Reiner have, however, resolved
this difficulty by proposing an approximate design criterion,

$$\Lambda_1(\xi_{n+1}) = \{\eta^{(1)}(\xi_{n+1}, \hat{\theta}_n^{(1)}) - \eta^{(2)}(\xi_{n+1}, \hat{\theta}_n^{(2)})\}^2, \qquad (1.4.1)$$

where the divergence between the responses, expected at the
(n+1)th run, is always evaluated through estimates, $\hat{\theta}_n^{(1)}$, $\hat{\theta}_n^{(2)}$,
of model parameters, based on the available n observations. The
experimental points are recommended to be designed, sequentially,
by maximizing the criterion (1.4.1) at each stage till the adequacy
of one of the models is confirmed.
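A minimal sketch of one design step in the spirit of (1.4.1) is given below. It assumes two hypothetical rival models that are linear in their parameters (a straight line and a quadratic through the origin), fits both by ordinary least squares to the n observations at hand, and picks the candidate setting where the squared gap between the two fitted responses is largest. The basis functions, the data, and the candidate grid are illustrative assumptions and do not come from Hunter and Reiner (1965) or from this thesis.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares estimate for a model linear in its parameters."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def basis_1(xi):
    """Hypothetical model 1: straight line in xi."""
    xi = np.asarray(xi, dtype=float)
    return np.column_stack([np.ones_like(xi), xi])

def basis_2(xi):
    """Hypothetical model 2: quadratic through the origin."""
    xi = np.asarray(xi, dtype=float)
    return np.column_stack([xi, xi**2])

def next_setting(xi_obs, y_obs, candidates):
    """Return the candidate with the largest squared gap between fitted responses."""
    theta_1 = fit_linear(basis_1(xi_obs), y_obs)
    theta_2 = fit_linear(basis_2(xi_obs), y_obs)
    gap = (basis_1(candidates) @ theta_1 - basis_2(candidates) @ theta_2) ** 2
    return candidates[np.argmax(gap)], gap

xi_obs = np.array([0.0, 0.5, 1.0, 1.5])        # settings already run (assumed)
y_obs = np.array([0.1, 0.9, 2.1, 2.9])         # made-up observations
candidates = np.linspace(0.0, 3.0, 31)         # grid over the operability region
xi_next, gap = next_setting(xi_obs, y_obs, candidates)
```

A grid search is used here only for simplicity; for a continuous operability region any numerical optimizer could take its place.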



Although developed through the assumption of the truth
of one of the models, taken alternately, the criterion in its
final form does not require any such assumption. This, in fact,
is the result of combining the two noncentrality parameters into
one function which, therefore, becomes symmetric in terms of
models. In actual practice, the investigator is expected to
confirm in the beginning if both models are adequate, neither
model is adequate, or one model is adequate. It is only in the
last situation that designing of additional experiments is
required till it is determined as to which of the two models
is adequate. As regards the testing of adequacy, the authors
have proposed the use of the F-test, though on a heuristic basis only;
theoretically, the application of such a test is not valid. The
replications needed to obtain a valid estimate of $\sigma^2$, as is
required for this test, are claimed to occur naturally through
the proposed design criterion. The authors have, however,
suggested deviating a designed point slightly if it could
thereby become a replicate.

The criteria which Atkinson and Fedorov (1975a) used for
designing discriminatory experiments in a bimodel situation are
also based on the noncentrality parameter (i.e., the sum
of squares for lack of fit) of the second model, after it has
been assumed that the first model is true and that its parameters
are known. The sequential and nonsequential design criteria,
actually proposed by them with different types of requirements,
have been basically developed through the equation

$$\Lambda_2(\xi^{*}) = \sup_{\xi} \Lambda_2(\xi), \qquad (1.4.2)$$

where $\xi^{*}$ is the resulting design, and $\xi$ is a normed measure
defined on the compact design space. The solution of this
equation will result in what they have termed a locally
optimal design. In an equivalence theorem they have laid down
the necessary and sufficient conditions for a design to be
T-optimum as well as listed some of the properties of such a
design. This theorem is, therefore, useful in confirming the
optimality of a given design or for investigating a proposed
numerical procedure for the construction of a discriminatory
design. Atkinson and Fedorov (1975a) have, in fact, established
through this theorem that the procedure of Hunter and Reiner
(1965) produces designs which are asymptotically T-optimum.

Nevertheless, the optimal designs of Atkinson and Fedorov
(1975a) greatly depend on the information as to which of the
two models is true as well as on the values used for the
parameters of this model. But, if one already knows about the
true model to that extent, the whole exercise of construction
becomes redundant. The alternative approaches suggested for
getting rid of this restriction, too, may not prove to be
fruitful as they further lead to other requirements such as
linearity of the models and prior information about the rival
models and their parameter distributions. Thus one may find
the use of their nonsequential method restrictive and difficult.
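To make the role of the noncentrality parameter concrete, the following sketch evaluates it for a discrete design under simplifying assumptions that are ours, not the thesis's: the first model is taken as true with known parameters, the second model is linear in its parameters, and the design is a finite set of points with weights. The value returned is the weighted lack-of-fit sum of squares left after the rival model has been fitted as closely as possible to the true response. The function name `noncentrality` and the example models are hypothetical.

```python
import numpy as np

def noncentrality(points, weights, eta_true, basis_rival):
    """Weighted lack-of-fit sum of squares of the rival model on a discrete design."""
    points = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normed design measure
    root_w = np.sqrt(w)
    target = eta_true(points)                # responses expected under the true model
    F = basis_rival(points)                  # regressors of the rival model
    # Weighted least squares: closest approach of the rival model to the true response.
    theta, *_ = np.linalg.lstsq(F * root_w[:, None], target * root_w, rcond=None)
    residual = target - F @ theta
    return float(np.sum(w * residual**2))

def eta_true(x):
    """Assumed true model: a quadratic with known coefficients."""
    return 1.0 + x + x**2

def basis_line(x):
    """Assumed rival model: a straight line."""
    return np.column_stack([np.ones_like(x), x])

support = [-1.0, 0.0, 1.0]
d_centre_heavy = noncentrality(support, [0.25, 0.50, 0.25], eta_true, basis_line)
d_equal = noncentrality(support, [1.0, 1.0, 1.0], eta_true, basis_line)
print(d_centre_heavy, d_equal)   # approximately 0.25 and 0.2222
```

On this three-point support the design that puts half of its weight at the centre yields the larger value, so under these assumptions it separates the two models better than equal weighting does.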

The practical utility of their work mainly lies in their sequential
procedure. This procedure consists of starting with an initial
design, estimating the parameters of both the models through
the least squares method, and then taking the next experiment
at a point where the divergence between the two expected responses
is maximum. The designing of experiments is continued till a
test of adequacy confirms the suitability of one of the two
rival models. This leads to a design which, no doubt, would
be T-optimum but has, nevertheless, certain shortcomings.
According to them, the design thus realized would be data dependent
and would require a considerable number of trials to wipe out the
effect of a large error in observations. Furthermore, the
procedure may lead to a singular design. Thus, in order to
construct an efficient design it is desirable that the
observations be as free from experimental error as possible.
Besides, the investigator must start with a nonsingular design.
Added to all these is the requirement of appreciable replication
for testing adequacy of the models. Even the availability
of the desired replications, however, may not help if one model
happens to be a degenerate form of the other. In this case the
inclusion of another suitable model may be a way out. But this
might further complicate the task of discrimination, in that it
leads to a multimodel discrimination problem.
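
One step of the sequential procedure described above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes two hypothetical polynomial rivals (a straight line and a quadratic) fitted by ordinary least squares with NumPy, a finite grid of candidate settings, and the squared difference of the fitted responses as the measure of divergence. The function name and its arguments are illustrative.

```python
import numpy as np

def next_design_point(x_obs, y_obs, candidates, deg_a=1, deg_b=2):
    """Pick the candidate setting where two least-squares fits diverge most.

    The two rival models are assumed (for illustration only) to be
    polynomials of degree deg_a and deg_b in a single input variable.
    """
    coef_a = np.polyfit(x_obs, y_obs, deg_a)   # least squares fit, model A
    coef_b = np.polyfit(x_obs, y_obs, deg_b)   # least squares fit, model B
    pred_a = np.polyval(coef_a, candidates)
    pred_b = np.polyval(coef_b, candidates)
    divergence = (pred_a - pred_b) ** 2        # squared separation of predictions
    best = int(np.argmax(divergence))
    return candidates[best], divergence[best]

if __name__ == "__main__":
    x_obs = np.array([0.0, 0.5, 1.0, 1.5])     # initial (nonsingular) design
    y_obs = 1.0 + x_obs ** 2                    # noise-free toy data
    grid = np.linspace(0.0, 3.0, 31)            # candidate settings
    x_next, sep = next_design_point(x_obs, y_obs, grid)
    print("next experiment at x =", x_next, "with squared divergence", sep)
```

In a full run this step would be repeated, each new observation being added to the data and both models refitted, until a test of adequacy favours one of the rivals.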

Multimodel Case. The use of both the criteria discussed so far
is limited to bimodel problems only, while in practice the
investigator might have to choose from a set of more than two
models. Using a rather Bayesian approach, Roth (1965) has
developed a criterion, based on Hunter and Reiner's idea only,
which could be used in a multimodel problem. This criterion is
a weighted average of the total separation among m rival
models, the weights being the prior probabilities of the models.
The (n+1)th experimental point is chosen at the maximum of


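$$\Delta_3(\xi_{n+1}) = \sum_{u=1}^{m} P_n^{(u)} \Big[ \prod_{v \neq u} \big| \eta^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) - \eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) \big| \Big], \tag{1.4.3}$$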

where $P_n^{(u)}$ is the prior probability and $\hat{\theta}_n^{(u)}$ are the estimates
of the parameters of model u, based on the previous n observations.
After each sequential run the probabilities, used as weights
in the criterion function, are updated through Bayes'
formula and used for decision making. Reilly (1970) has
criticized Roth's criterion on the ground that it fails to
consider the uncertainty associated with prediction. It may be
argued that it is not safe to use this criterion if the
parameters of all the models are not estimated with the same
precision. In fact, large divergences due to imprecise estimates
of certain model parameters might unnecessarily inflate the
criterion value. In this case, the effect of separations
resulting from precisely estimated responses is subdued. Since
the criterion does not take into account the magnitude of the
precision of the predicted responses, this may result in a
superficial maximum and consequently in a wrong design. Another
shortcoming of this criterion lies in its slow convergence to
the desired discriminatory level when used for discrimination
among mathematically similar models.
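
The following sketch evaluates a criterion of the kind described above at one candidate setting. It assumes that the separation for model u is the product, over the other models, of the absolute differences between predicted responses, and that these separations are averaged with the current model probabilities as weights. The function name and inputs are illustrative and are not taken from Roth's work.

```python
import numpy as np

def roth_criterion(predictions, probs):
    """Probability-weighted total separation among rival models.

    predictions : predicted responses of the m models at one candidate point
    probs       : current (prior) probabilities of the m models
    Assumed form: sum_u P_u * prod_{v != u} |pred_v - pred_u|.
    """
    predictions = np.asarray(predictions, dtype=float)
    probs = np.asarray(probs, dtype=float)
    total = 0.0
    for u in range(len(predictions)):
        others = np.delete(predictions, u)              # responses of the rivals of u
        total += probs[u] * np.prod(np.abs(others - predictions[u]))
    return total

if __name__ == "__main__":
    print(roth_criterion([1.0, 2.0, 4.0], [0.5, 0.3, 0.2]))
```

Note that the predictions enter without any measure of their precision, which is the point of the criticism discussed above.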



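Froment (1975) has proposed a straightforward extension
of the Hunter-Reiner criterion to multimodel situations. This
consists of maximizing

$$\Delta_4(\xi_{n+1}) = \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} \big\{ \eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) - \eta^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) \big\}^2 \tag{1.4.4}$$

with respect to $\xi_{n+1}$, the (n+1)th setting of the input variable(s).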

Hosten and Froment (1976) have, instead, preferred the absolute
value of the difference between the responses predicted through
models u and v. Discrimination in both the methods is done by
screening out bad models sequentially. So far as the detection
of ill-fitting models is concerned, they use the fact that the
mean sum of squares for lack of fit of a model will be an
unbiased estimate of the error variance for the correct model
only. Thus, if $\sigma^2$ is known or an estimate of $\sigma^2$ is available,
the inadequate models are discarded at any stage through an
F-test. In case of complete ignorance about the magnitude of the
variance of errors, this job is proposed to be done by means of
Bartlett's $\chi^2$ statistic. This statistic, basically, tests the
homogeneity of the estimates of $\sigma^2$ obtained through the rival
models. At any stage the model with the largest estimate of $\sigma^2$
is dropped if $\chi^2$ exceeds the tabulated value. This way more
and more models are eliminated as discrimination advances from
one stage to another. The application of this statistic requires
each model to be at least locally linear. Besides, $\chi^2$ being
sensitive to departures of data from normality, care has to be
taken with outliers. So far as the independence of estimates of
variance is concerned, Froment (1975) has observed that the
procedure is not sensitive to it. Dumez and Froment (1976) have
used $\Delta_4(\xi_{n+1})$ as the design criterion function for discriminating
among fifteen rate equations of dehydrogenation of 1-butene to
butadiene on a chromium-aluminium-oxide catalyst in a differential
reactor. The technique worked successfully in picking up one
equation as representative of the system under consideration.
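
The screening step based on Bartlett's statistic can be sketched as below. The code uses the standard textbook form of Bartlett's statistic for m variance estimates with given degrees of freedom; the function names, the inputs, and the idea of passing the tabulated critical value as an argument are assumptions made for illustration. The standard test presumes independent estimates, a condition the rival-model estimates do not strictly satisfy, as noted above.

```python
import math

def bartlett_statistic(variances, dofs):
    """Bartlett's chi-square statistic for homogeneity of variance estimates.

    variances : estimates of the error variance, one per rival model
    dofs      : degrees of freedom associated with each estimate
    """
    m = len(variances)
    nu = sum(dofs)
    pooled = sum(d * s for d, s in zip(dofs, variances)) / nu
    numerator = nu * math.log(pooled) - sum(
        d * math.log(s) for d, s in zip(dofs, variances))
    correction = 1.0 + (sum(1.0 / d for d in dofs) - 1.0 / nu) / (3.0 * (m - 1))
    return numerator / correction   # referred to chi-square with m - 1 d.f.

def model_to_drop(variances, dofs, critical_value):
    """Index of the model with the largest variance estimate, or None.

    A model is dropped only when the statistic exceeds the tabulated
    (user-supplied) critical value.
    """
    if bartlett_statistic(variances, dofs) > critical_value:
        return max(range(len(variances)), key=lambda u: variances[u])
    return None

if __name__ == "__main__":
    # 3.841 is the 5% point of chi-square with one degree of freedom.
    print(model_to_drop([1.0, 4.0], [10, 10], critical_value=3.841))
```

Repeating this check after each new run, with the dropped model removed from the lists, mimics the stage-by-stage elimination described in the text.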

Another extension of a bimodel discrimination method to the
case of several models is proposed by Atkinson and Fedorov (1975b).
Once again, they utilize the concept of maximizing the noncentrality
parameter of a model. In case of m rivals this requires
simultaneous maximization of (m-1) noncentrality parameters,
assuming that one of the models is true with known parameters.
To that purpose they choose to adopt a maximin approach in solving
a continuous design problem and call the designs thus obtained
T-optimum. As with its bimodel analogue, an equivalence theorem
prescribes the necessary and sufficient conditions for a design
to be optimal. This theorem is useful in determining the
optimality of a given design but has little use in constructing
an optimal design. The authors have, however, proposed sequential
procedures for actually realizing such a design, mainly in
three types of situations. If there is one closest rival, as
can be seen by comparing the residual sums of squares, the new
experiment is recommended to be chosen in such a way that the
divergence between the responses predicted through the best fitting
models is maximum. The sequential method of Atkinson and Fedorov
may produce a nonoptimal design if there happens to be more
than one equally close model. If there are two closest rivals,
the situation can still be managed through the iterative
procedure proposed for this case. But the case of several
equidistant models is difficult to handle.

Atkinson and Fedorov (1975b) have also proposed nonsequential
methods for constructing an optimal design under
certain assumptions which are not different from the ones
required in their bimodel analogues. The occurrence of equality
of two or more noncentrality parameters may once again add to
the difficulty which the investigator may already be facing in
meeting some of the basic requirements, such as linearity of
rival models and complete knowledge about the true model or else
prior information in the form of the probability distributions
and prior probabilities of the models being discriminated.

We now discuss those criteria which are based on the 
argument that in discriminating among models the point of 
maximum divergence between the expected responses may not 
necessarily be the point of maximum discrimination. In fact, 
some researchers in this field have felt that it is the 
divergence in responses of models relative to the limits of 
their errors which plays a more important role in generating 
a good discriminatory point. According to them, the variances
of the estimated responses must, therefore, come into the picture.
These researchers have actually taken note of this fact in the
development of the design criterion.



For example, Fedorov and Pazman (1968) have included, in
addition, the variances of the predicted responses. Assuming
that the variance of the observations, $\sigma^2(\xi)$, is a known
function of the independent variable(s) and that the errors
are normally distributed about zero, they propose to maximize


^-n+l^ “ £" 


°u<W +0 W 


] 




+ <°u ( ln + l ) - »V 


(1.4.5) 


under the null hypothesis that model u (u = 1, 2) is the true one,
where $\sigma_u^2(\xi_{n+1})$ is the variance associated with the prediction,
$\eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)})$. In the absence of knowledge about the
true model they suggest the use of the smaller of the criterion
values or some weighted combination of the two. The criterion
function (1.4.5) seems to have originated from an earlier work
of Pazman and Fedorov (1968), where the null hypothesis is about
the unbiasedness of one of the two sets of estimates of the
parameters of a given model rather than the truth of one of the
two models. The criterion given by equation (1.4.5) has certain
limitations. First of all, it cannot be used if there are more
than two competing models. Secondly, the requirement that the
variance of observations be a known function of the independent
variable(s) cannot be met so easily in actual practice.

The criteria discussed so far either
neglect the prediction variances or, if they do not, are
applicable to bimodel problems only. Box and Hill (1967) have
proposed a criterion which not only takes into account the
prediction errors but can also be used in multimodel situations.
Besides, it does not require one of the rivals to be known in
advance as the true model. This criterion, which utilizes
Shannon's (1948) concept of entropy, has been basically
developed through the maximization of the change in entropy,
R (say), expected due to the (n+1)th experiment yet to be
conducted. Box and Hill have, however, used its upper bound,
D (say), for designing a new experimental setting in a multimodel
problem. In particular, for discriminating among m
models they have assumed that under model u the errors are
normally distributed about zero mean with a variance, $\sigma^2$ (known
and constant for all experiments); that the models are linear
or approximately linear in the neighbourhood of certain estimates
of their parameters; and that a locally uniform (noninformative)
prior distribution is appropriate for the parameters
of each model. With these assumptions they have obtained D to
be of the form

$$\Delta_6(\xi_{n+1}) = \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} P_n^{(u)} P_n^{(v)} \left[ \frac{(\sigma_u^2 - \sigma_v^2)^2}{(\sigma^2 + \sigma_u^2)(\sigma^2 + \sigma_v^2)} + \big\{ \eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) - \eta^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) \big\}^2 \left( \frac{1}{\sigma^2 + \sigma_u^2} + \frac{1}{\sigma^2 + \sigma_v^2} \right) \right], \tag{1.4.6}$$

where $\hat{\theta}_n^{(u)}$ is the maximum likelihood estimate (m.l.e.) of $\theta^{(u)}$ based
on the available n observations, and $\sigma_u^2$ is the variance of
$\eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)})$. In order to decide upon the input for the
(n+1)th experimental run, the criterion function $\Delta_6(\xi_{n+1})$ is
maximized with respect to $\xi_{n+1}$. This is followed by a revision
of the model probabilities, $P_n^{(u)}$, by means of Bayes'
formula

$$P_k^{(u)} = \frac{P_{k-1}^{(u)} f^{(u)}(y_k)}{\sum_{v=1}^{m} P_{k-1}^{(v)} f^{(v)}(y_k)}, \qquad k = n+1, \ldots; \quad u = 1, 2, \ldots, m, \tag{1.4.7}$$

where $f^{(u)}(y_k)$ is the probability density function (p.d.f.)
of $y_k$ under model u and $P_{k-1}^{(u)}$ is the probability of model u
at the previous, (k-1)th, stage. The quantity $P_k^{(u)}$ is utilized
in assessing the status of each model at every stage in the
sequential process, which is terminated as soon as one of the
given models emerges as the best model, i.e., the model with
the highest posterior probability (close to 1.0).
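
The two computational steps of this procedure, evaluating the criterion at a candidate setting and revising the model probabilities after a new observation, can be sketched as follows. The sketch assumes the pairwise form of the criterion built from the prediction variances and the separation of the predicted responses, with any constant factor multiplying the double sum omitted since it does not change the maximizing point. It also assumes that the density of a new observation under model u is normal, centred at the predicted response, with variance equal to the error variance plus the prediction variance. All names are illustrative.

```python
import numpy as np

def box_hill_criterion(preds, pred_vars, probs, sigma2):
    """Pairwise discrimination criterion at one candidate setting.

    preds     : predicted responses of the m models
    pred_vars : variances of those predictions (sigma_u^2)
    probs     : current model probabilities
    sigma2    : error variance, assumed known and constant
    """
    preds = np.asarray(preds, dtype=float)
    pred_vars = np.asarray(pred_vars, dtype=float)
    probs = np.asarray(probs, dtype=float)
    total = sigma2 + pred_vars                  # sigma^2 + sigma_u^2
    m = len(preds)
    value = 0.0
    for u in range(m - 1):
        for v in range(u + 1, m):
            var_term = (pred_vars[u] - pred_vars[v]) ** 2 / (total[u] * total[v])
            sep_term = (preds[u] - preds[v]) ** 2 * (1.0 / total[u] + 1.0 / total[v])
            value += probs[u] * probs[v] * (var_term + sep_term)
    return value

def update_probabilities(probs, y_new, preds, pred_vars, sigma2):
    """Bayes revision of the model probabilities after observing y_new.

    Assumes a normal density for y_new under each model, with mean
    preds[u] and variance sigma2 + pred_vars[u].
    """
    probs = np.asarray(probs, dtype=float)
    total = sigma2 + np.asarray(pred_vars, dtype=float)
    resid = y_new - np.asarray(preds, dtype=float)
    density = np.exp(-0.5 * resid ** 2 / total) / np.sqrt(2.0 * np.pi * total)
    weighted = probs * density
    return weighted / weighted.sum()

if __name__ == "__main__":
    print(box_hill_criterion([1.0, 3.0], [0.5, 0.5], [0.5, 0.5], sigma2=0.5))
    print(update_probabilities([0.5, 0.5], 1.0, [1.0, 3.0], [0.5, 0.5], sigma2=0.5))
```

In use, the criterion would be evaluated over the admissible settings, the experiment run at the maximizing setting, and the probabilities revised, the cycle being repeated until one probability is close to 1.0.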


The Box-Hill procedure has actually been used by several
investigators, who have reported varied experiences
with its implementation. Hunter and Mezaki (1967) have
successfully applied this procedure to some practical problems
in chemical engineering, while Froment and Mezaki (1970) and some
other investigators faced certain difficulties in using it. The
procedure has been subjected to considerable criticism which has
led to some modifications. Meter et al. (1970) and Reilly
(1970) have expressed their surprise over the criterion itself;
the maximization of the upper bound of the expected entropy change,
D, for realizing a design point looked strange to them.
Reilly (1970) and Fedorov (1972) have instead obtained
approximations to R. However, the examples in which these
approximations have been used as a criterion function do not
exhibit any significant difference in designs. Box and Hill
(1967) have employed weights in the design criterion to ensure
more emphasis on pairs of models with high probabilities, but
Meter et al. (1970) are doubtful if these weights really do
the job. In fact, if a term inside the square brackets in D
dominates the probability weightings, the criterion may select
experiments carrying a large amount of information in support
of the models which are already on the verge of being dropped
out. Buzzi and Forzatti (1983) suspect the criterion of picking up
those points where the discrepancy between variances of estimated
responses is large, even if the divergence between the two
predictions is small or negligible. It has also been observed
that the design point obtained at the maximum of D may not be
the one which maximizes R, although the proposed criterion is
claimed to be based on the maximization of the expected entropy
change. Furthermore, in the development of the criterion function,
Box and Hill have assumed the ratio of each pair of prior densities
to be unity, while Atkinson and Cox (1974) feel that this should
make sense only if the parameters in both the models have the
same number and similar interpretations. Box and Henson (1970)
seem to have had this fact in mind when they expected an improvement
in the formula (1.4.7) through the use of an alternative
expression for the density $f^{(u)}$; namely,

f^fei+l^ = exp ^ — ^2 + P u ° 2 H (1.4.8) 

where $S^{(u)}$ is the residual sum of squares from (n+1)
observations under model u, which contains $p_u$ parameters, and
$c^{(u)}$ is the coefficient that accounts for the parameter
indetermination. Buzzi and Forzatti (1983) have instead noted
that even if this modification is introduced in Bayes'
formula, the posterior probability as a measure of the quality
of a model remains equivocal. Their experience indicates that
a change in the order of the designed experiments may change
the inference on model selection. The numerical problems in
the calculation of model probabilities through the modified
formula arise partially due to the exponential relation between
$f^{(u)}$ and $S^{(u)}$ and partially because of the form of $c^{(u)}$.

There are still more serious allegations against the Box-Hill
procedure for model discrimination. Some researchers have
even gone to the extent of remarking that this procedure can
sometimes lead to false conclusions. Andrews (1971) fears that
the procedure may even put a true model with more parameters
in danger if a simpler model is close to it. In a simple
problem of discriminating between two models, Atkinson and
Cox (1974) have also observed that the posterior probability,
as a model adequacy criterion, tends to prefer a model with
fewer parameters, although from the significance testing
point of view both the models have been found to be adequate.

Atkinson (1978), too, has pointed out that a simple 
Bayesian formula for the posterior probability of a model is 
misleading unless all the models have the same number of 
parameters. According to him, this formula leads to arbitrary 
inferences. He has, therefore, suggested an alternative 
quantity for choosing a model. Atkinson has further pointed
out that the Box-Hill approach tends to force a choice between
models; even if both the models fit badly, the better fitting
model will eventually be chosen. In fact, the probability 
of such a model can be brought as close to unity as desired, 
by continuing experimentation long enough. It is, therefore, 
important to check the adequacy of the chosen model through a 
suitable method before it is finally selected for actual use. 

In case of nested models, even though all of them may be true, 
the behaviour of the posterior probability does not indicate 
this reality. Box and Hill (1967) have considered an example 
in which all the rival models are generalizations of the model 
assumed to be true and the criterion picks up the simplest model 
after 15 points have been designed. Siddik (1972) has further
investigated this situation and found that the probability of
the simplest model keeps oscillating between 0.85 and 0.95,
instead of converging to 1.0. Froment and Mezaki (1970),
Wentziemer (1970), and Hill (1976) have also observed the
posterior probabilities of models to be oscillating considerably
from one experimental run to another and have warned that the
criterion must be used cautiously.

Finally, one notices that Box and Hill (1967) have, no
doubt, catered for the variances of the estimated responses
in the discriminatory design criterion, but have, at the same
time, introduced a quantity, $\sigma^2$, the error variance, which
is seldom known in practice. Froment and Mezaki (1970) have
actually found this requirement a demanding one and instead
used a precise estimate of $\sigma^2$. But such an alternative may
not always be possible, especially when replications are not
available. Hill and Hunter (1969) have suggested a modification
of the Box-Hill criterion itself so as to bring the case of
unknown variance under its purview. In order to arrive at the
final form of the criterion function they have used approximations
to the functions involved in the integral which is supposed to
yield an upper bound, D, of the expected entropy change in the
unknown variance case. Their criterion is, thus, an approximation
to D, and an objection may, therefore, be raised to its
accuracy. The criterion, however, looks quite similar to the
Box-Hill criterion. On the other hand, Hosten and Froment (1976)
have criticized the inclusion of the variances of predictions,
which include $\sigma^2$ as one of the components. To them it appears
that such a quantity unnecessarily complicates the expression
for the design criterion rather than playing a significant role
in designing a discriminatory experiment.



The work of Hsiang and Reilly (1971) is mainly based on
the objection that the other criteria fail to allow for prior
knowledge, if any, about the values of model parameters.
Their personalistic approach utilizes all the information
about the degree of belief that one has in rival models and the
possible values of their parameters. They have incorporated
this information in the form of subjective probability
distributions. The procedure laid down by them consists of
updating the probabilities of rival models as well as revising
the parameter probabilities by Bayes' formula as data come to
hand. So far as designing new experiments is concerned, Hsiang
and Reilly have proposed the use of Roth's criterion function
given by (1.4.3). The implementation of their discrimination 
method requires storage of discrete values of the parameters 
of all the models being considered. The method, therefore, 
is likely to present some computational difficulties if 
adequate computing facility is not available. Besides, the 
procedure being entirely based on a particular type of 
information about the competing models, it may not be possible 
to use it judiciously if in a given situation the required 
information is not available. The use of the proposed method 
is, therefore, restrictive. 

Buzzi and Forzatti (1983) belong to a class of researchers
in the field of model discrimination who consider the
participation of the variances of predictions to be important
in designing experiments for discrimination. They have proposed
a criterion which is based on the ratio of two estimates of the
variance of residuals under different models, corresponding to
an observation due to be realized. According to them, the
maximum expected discrimination is attainable at the maximum
of this ratio, provided it is rendered greater than unity. To
be specific, the new experiment in their sequential procedure
may be conducted at a point which maximizes the criterion
function


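$$\Delta_7(\xi_{n+1}) = \frac{\sum_{u=1}^{m-1} \sum_{v=u+1}^{m} \big\{ \eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) - \eta^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) \big\}^2}{(m-1) \big\{ m \hat{\sigma}^2 + \sum_{u=1}^{m} \hat{\sigma}_u^2(\xi_{n+1}) \big\}}, \tag{1.4.9}$$

where $\hat{\sigma}^2$ is an estimate of the error variance $\sigma^2$ and $\hat{\sigma}_u^2(\xi_{n+1})$
is an estimate of the variance of the prediction made at $\xi_{n+1}$
through model u, given in Buzzi and Forzatti (1983).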


There may be m models being discriminated but, at any
stage, the criterion includes only those models which are good
from the point of view of adequacy. Since the inadequate models
are not allowed to participate in the selection of a new
experiment, it is essential to test each model for its consistency
with the data available at that stage. To that purpose Buzzi
and Forzatti have recommended the use of the classical F-test if
an estimate of $\sigma^2$ is available, and Hartley's F-max or Bartlett's
$\chi^2$ if this requirement is not met. However, a model once dropped
can be included again if at a later stage it proves
to be adequate.



The proposed procedure asks for suspension of designing
for discrimination if at any stage $\Delta_7$ could not be raised
above unity by any setting of the input variable(s) in OR. It
may, however, be possible to restart the discrimination process
if some additional points could be made available for precise
estimates of the parameters so as to minimize $\hat{\sigma}_u^2$. This
necessitates designing for parameter estimation. According
to the authors, the discrimination potency of the procedure can
be enhanced by concentrating on the pair of models which are
most divergent in the sense used in their work. Besides, a
final residual analysis for confirmation of adequacy of the
selected model is recommended in case the F-max or $\chi^2$ statistic
has been used for dropping the ill-fitting models during the
course of sequential design of experiments.
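
The ratio criterion and the rule of suspending the design when the ratio cannot exceed unity can be sketched as below. The sketch assumes the criterion is the sum of squared pairwise differences between predictions divided by (m-1) times the sum of m error-variance estimates and the m prediction variances; only the models currently judged adequate are meant to be passed in. Function names and inputs are illustrative.

```python
def buzzi_forzatti_criterion(preds, pred_vars, sigma2_hat):
    """Ratio criterion at one candidate setting.

    preds      : predicted responses of the m adequate models
    pred_vars  : estimated variances of those predictions
    sigma2_hat : estimate of the error variance
    """
    m = len(preds)
    numerator = sum((preds[u] - preds[v]) ** 2
                    for u in range(m - 1) for v in range(u + 1, m))
    denominator = (m - 1) * (m * sigma2_hat + sum(pred_vars))
    return numerator / denominator

def worth_designing(value):
    """Design for discrimination only if the ratio exceeds unity."""
    return value > 1.0

if __name__ == "__main__":
    for preds in ([1.0, 3.0], [1.0, 1.5]):
        value = buzzi_forzatti_criterion(preds, [0.5, 0.5], sigma2_hat=0.5)
        print(preds, value, worth_designing(value))
```

When no candidate setting passes the check, the text above suggests switching to designs aimed at parameter estimation so that the prediction variances in the denominator shrink.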

Buzzi and Forzatti (1983) have pointed out certain
advantages which their criterion has over others, especially
over that of Box and Hill (1967). For instance, the quantities
$\hat{\sigma}_u^2$, being in the denominator of $\Delta_7$, do not affect the criterion
value adversely. The criterion does not give undue importance
to the divergence between variances of predictions at the cost
of divergence between predictions. Besides, there is no
danger of getting experimental points which provide
stronger evidence in favour of bad models, as these models
are kept out of the picture while designing for discrimination.
Finally, they have claimed that their discrimination procedure
is equipped with a desirable stopping rule which some of the
other criteria fail to possess.



1.4.2 Review of the Previous Work: Multiresponse Case

The problem of model discrimination is aggravated if, in 
a given situation, the underlying process happens to be a 
multiresponse process. In this case, the data from the system 
come as measurements on two or more responses which must be 
considered together if one wishes to extract complete information 
contained in the observations. Situations of this type
arise quite naturally in multicomponent processes, such as those
involving complex chemical reactions or physical equilibria,
and are not uncommon. The models structured for the
systems based on these types of processes are usually complex
and difficult to handle, especially when these models are
nonlinear in parameters. It may also happen that one has to
identify an appropriate model from a host of equally plausible
models before it can be used for studying a given system.
Investigators in various disciplines frequently face this
problem. But, unfortunately, there are only a few techniques
available in the literature which can be used for discriminating
among multivariate models. We now discuss some of the salient
features of these techniques.

Roth (1965) is, probably, one of the first few who have 
considered the multivariate model discrimination problem. In 
order to design experiments for discriminating among m r- 
response models he has suggested the use of the weighted 
function 





^Jn+l^ 


r m 

rriE 


i=l u=l 


p(u) 

n 





(1.4.10) 


where $P_n^{(u)}$ is the probability of the truth of model u (also
called prior probability) and $\hat{\theta}_n^{(u)}$ is the m.l.e. of $\theta^{(u)}$ at
the nth stage. The (n+1)th setting of the input variable(s)
can be decided by maximizing $\Delta_8$ with respect to $\xi_{n+1}$. In
this sequential procedure, whenever an observation on the
r-response random vector is obtained at this designed point,
the investigator is also supposed to revise the probability of
each model through Bayes' formula. This formula for
calculating the posterior probability makes use of the
multivariate density of Y at the (n+1)th run. It may be noted
that the proposed procedure is just an extension of his
univariate method and, likewise, neglects such errors in
divergences as are likely to creep in if the parameters of
the models are not estimated precisely. The design criterion
may also give undue importance to models with low probabilities.
Besides, the posterior probability itself is not considered to
be a reliable tool for making inferences on model adequacy.


One of the several shortcomings that Hsiang and Reilly
(1971) have noticed in the Box-Hill procedure is its inability
to handle multiresponse situations. The problem of model
discrimination in such situations, in their opinion, can also
be tackled through the same Bayesian approach as has been
suggested for the univariate models. If the prior information
in the form of model probabilities, the parameter distributions,
and the multivariate density of Y under each model is available,
then one could utilize it in designing experiments for the
purpose of discrimination. In fact, it is their formula for
calculating the posterior probability of a model which combines
all this information. It may, however, be noted that this
formula is different from that of Box and Hill (1967) but is
used for the same purpose, i.e., for revising the probability
of the truth of each model at different stages. The criterion
actually proposed for design of experiments is just the product
of the values of $\Delta_3$, given in (1.4.3), for each response. Like
its univariate analogue this method, too, may present some
computational difficulties, which may be more serious in a
multiresponse situation. The authors have given some suggestions
for handling the storage of the data. The proposed procedure
does not require the error covariance matrix to be known in advance,
nor does it use linearizations if the models being discriminated
are nonlinear. The discrimination, however, is not possible
through the procedure of Hsiang and Reilly if the required
information is not available a priori.


Hosten and Froment (1976) have proposed a procedure which
is based on elementary statistical principles. The design
criterion suggested therein is a straightforward extension of
their own uniresponse criterion with an additional provision of
taking the precision of the responses into account. This has
been done by incorporating the weights, $w_i$, defined through
the variance of response i. According to them, the
weighted function

$$\Delta_9(\xi_{n+1}) = \sum_{i=1}^{r} w_i \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} \big| \eta_i^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) - \eta_i^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) \big|, \tag{1.4.11}$$

with $\hat{\theta}_n^{(u)}$ as the m.l.e. of $\theta^{(u)}$, provides us with a suitable
measure of divergence and can, therefore, be used in designing
a discriminatory experiment. As regards analysis of observations
obtained from the experiment, Hosten and Froment have used the
fact that the statistic

$$\chi^{2(u)} = \sum_{i=1}^{r} \sum_{j=1}^{r} \sum_{k=1}^{n} \sigma^{ij} \big\{ y_{ik} - \eta_i^{(u)}(\xi_k, \hat{\theta}_n^{(u)}) \big\} \big\{ y_{jk} - \eta_j^{(u)}(\xi_k, \hat{\theta}_n^{(u)}) \big\} \tag{1.4.12}$$

is distributed like Chi-square with $(nr - p_u)$ degrees of freedom.
The quantity $\sigma^{ij}$ used in equation (1.4.12) is the (i,j)th element
of the inverse of the covariance matrix of errors, $\Sigma$ (assumed
to be known in advance). Since the parameter estimation has
to be carried out at each stage in order to make $\hat{\theta}_n^{(u)}$ available
for the criterion function at the subsequent stage, the m
values of $\chi^{2(u)}$ are generated as a by-product and can be used for
assessing the adequacy of the rival models. The status of each
model is, thus, determined and a model with $\chi^{2(u)}$ greater than
the corresponding tabulated value is dropped once and for all.
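
The adequacy statistic described above can be computed directly from the matrix of residuals of a model. The sketch below assumes the statistic is the sum over runs of the quadratic form of each residual vector in the inverse of the known error covariance matrix; the function names and the layout of the residual array (runs in rows, responses in columns) are illustrative choices.

```python
import numpy as np

def multiresponse_chi2(residuals, sigma):
    """Lack-of-fit statistic of one model in an r-response system.

    residuals : (n, r) array, observed minus predicted response i at run k
    sigma     : (r, r) covariance matrix of the errors, assumed known
    """
    residuals = np.asarray(residuals, dtype=float)
    sigma_inv = np.linalg.inv(np.asarray(sigma, dtype=float))
    # sum over runs k and responses i, j of sigma^{ij} * e_ik * e_jk
    return float(np.einsum("ki,ij,kj->", residuals, sigma_inv, residuals))

def degrees_of_freedom(n_runs, n_responses, n_params):
    """Degrees of freedom n*r - p_u of the reference Chi-square."""
    return n_runs * n_responses - n_params

if __name__ == "__main__":
    resid = [[1.0, 0.0], [0.0, 2.0]]
    print(multiresponse_chi2(resid, np.eye(2)), degrees_of_freedom(2, 2, 1))
```

A model would then be dropped when its statistic exceeds the tabulated Chi-square value for the stated degrees of freedom.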


Some of the advantages of this discrimination method lie
in the simplicity of the calculations involved and in allowing
even nonlinear models, although the statistic $\chi^{2(u)}$ in that
case would be approximately distributed like Chi-square. Besides,
the model finally selected through this method does not suffer
from lack of fit. The analysis of experiments, therefore, is
not required to be supplemented with additional tests. The
discrimination thus achieved has been claimed to result in
saving of experimental runs. The procedure of Hosten and
Froment has, however, the limitation of being available only
for those situations in which $\Sigma$ is known, a requirement which
is rarely met in practice.

While Roth (1965) has neglected the covariance structure
of the estimated responses, Hosten and Froment (1976) are
opposed to its inclusion in the criterion. Hill and Hunter
(1967), on the contrary, have given due importance to the
precision and the covariances of the predicted values. They
have been able to do so through an extension of the Box-Hill
method to the multiresponse case. Likewise, Shannon's concept
of entropy has been utilized in a multivariate set-up. In
this case, too, it is the upper bound of the expected entropy
change which has been exploited for arriving at a criterion
function. In order to develop a usable function they have
brought in the multivariate equivalents of all those assumptions
which have been used by Box and Hill (1967) in the single
response case. This has enabled them to arrive at an analogous
expression for the discriminatory design criterion, given by



$$\Delta_{10}(\xi_{n+1}) = \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} P_n^{(u)} P_n^{(v)} \Big[ \mathrm{tr}\big( \Sigma_u \Sigma_v^{-1} + \Sigma_v \Sigma_u^{-1} - 2 I_r \big) + \big\{ \eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) - \eta^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) \big\}' \big( \Sigma_u^{-1} + \Sigma_v^{-1} \big) \big\{ \eta^{(u)}(\xi_{n+1}, \hat{\theta}_n^{(u)}) - \eta^{(v)}(\xi_{n+1}, \hat{\theta}_n^{(v)}) \big\} \Big], \tag{1.4.13}$$

where $\Sigma_u$ is the precision matrix of model u, $\Sigma_u^{-1}$ is the inverse
of $\Sigma_u$, and $I_r$ is an r × r identity matrix. The criterion
obviously makes use of the error structure of the estimated
responses through the precision matrix, $\Sigma_u$. The decision on
the model adequacy is still taken on the basis of the posterior
probability, although it now involves the multivariate density
of Y at the (n+1)th run. The method, overall, suffers from
certain drawbacks as does its uniresponse analogue. To list
a few, the criterion demands a known and constant covariance
matrix, $\Sigma$; the criterion tends to design points which may be
more informative about the inadequate models; the oscillating
behaviour of the posterior probability is misleading; and
lastly, the procedure tends to favour the simpler model at the
cost of the true one. Besides, all those objections which
Hsiang and Reilly (1971) have raised against the Box-Hill
procedure implicitly hold for its multiresponse version. Whereas
many improvements have been devised for the Box-Hill criterion,
not much has been done to remove one or more of the above
mentioned shortcomings in the discrimination criterion proposed
by Hill and Hunter (1967).





The work of Prasad and Rao (1977) is one of the few 
attempts made to improve upon this procedure. According to 
them, a more accurate and faster discrimination may be achieved 
if the expected likelihood is used in lieu of the point likelihood 
for calculation of posterior probabilities of models. This 
not only results in a sharper discrimination but also selects 
better points through the Hill-Hunter criterion function. The 
development of such an alternative expression for the model 
likelihood is based on the simple idea that likelihood, being 
a function of the parameters, is a random variable and can, 
therefore, be subjected to the expectation operator. So far 
as design of experiments is concerned, Prasad and Rao have 
recommended the use of Hill-Hunter criterion only. This means 
that the only difference between the two methods lies in the 
formulae for posterior probability: Hill and Hunter have used 
the point likelihood, while Prasad and Rao have recommended the 
use of the expected likelihood. The performance of the two 
alternative formulae has been compared by Prasad and Rao (1977) 
through an application to a practical example involving eleven 
bivariate models. With the same prior information and initial 
conditions, the use of the expected likelihood has been found 
to be more decisive so far as the rejection of the ill-fitting 
models is concerned. It has also been shown through this 
example that the convergence towards complete discrimination 
is faster with the expected likelihood than it is with the point 
likelihood. The authors have, however, pointed out that the 
superiority of their method depends on the models as well as 
on the type of data. The modification introduced by them 
seems to have improved the rate of discrimination of the 
multiresponse version of the Box-Hill procedure, but has not 
removed the basic drawbacks. 


Buzzi and Forzatti (1983) have claimed to remove certain 
limitations and anomalies of the Box-Hill procedure for 
univariate models through an entirely different approach. 

The work of Buzzi et al. (1984) is a similar attempt in this 
direction for the multiresponse case. In fact, they have 
subjected the Hill-Hunter method to a similar criticism and 
developed a method for multiresponse model discrimination 
which happens to be a generalization of the univariate method 
of Buzzi and Forzatti (1983). The criterion is based on the 
minimization of the likelihood function of the divergences 
under any pair of models. As this is equivalent to maximizing 
the exponent involved, a convenient function for the purpose 
of designing discriminatory experiments is considered to be 


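$$T_{u,v}(\xi_{n+1}) = \{\eta^{(u)}(\xi_{n+1},\hat\theta_n^{(u)}) - \eta^{(v)}(\xi_{n+1},\hat\theta_n^{(v)})\}'\,\Sigma_{u,v}^{-1}\,\{\eta^{(u)}(\xi_{n+1},\hat\theta_n^{(u)}) - \eta^{(v)}(\xi_{n+1},\hat\theta_n^{(v)})\}, \qquad (1.4.14)$$

where $\Sigma_{u,v} = \Sigma + \Sigma_u + \Sigma_v$. In a problem of discriminating among 
m r-variate models the next experiment, in the sequential 
procedure of Buzzi et al., is recommended to be conducted at a 
point which maximizes $T_{u,v}$ over any pair of models, u,v, provided 
$T_{u,v}(\xi_{n+1})$ is greater than the number of responses involved. 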

The design for discrimination is continued as long as this inequality 
is satisfied. The emphasis is, thereafter, shifted to design 
of some experiments for precise estimation of model parameters. 
This may make the restart of the discrimination process possible. 
While the adequacy of all the models participating in 
discrimination, at each stage, is recommended to be checked 
by the $\chi^2$-test, the procedure is finally wound up by carrying out 
residual analysis of the selected model. It may be noted that 
the implementation of this technique requires multivariate normal 
distribution of errors with known covariance matrix, $\Sigma$. Besides, 
the models being discriminated are required to be linear or 
linearized in the parameter space. The procedure shares all the 
advantages and disadvantages of its univariate analogue. 



CHAPTER 2 


METHODOLOGY FOR MODEL DISCRIMINATION 
DESIGN AND ANALYSIS 


2.1 AN APPROACH TO THE PROBLEM OF MODEL DISCRIMINATION 

A mechanistic model is an abstract formulation aimed at 
simulating the mechanism of a real world phenomenon. The basic 
step involved in building such a model for a system is, therefore, 
to establish the true nature of its process. To that purpose, 
it may be borne in mind that so far as observations are 
concerned, randomness is somehow imbued into the process 
through one source or another, thus rendering the process 
probabilistic in nature. Accordingly, if one endeavours to 
build a model for such a system one must consider the random 
error, in addition to the parameters (associated with the 
mechanism of the process) and a set of exogenous variables 
(characterizing the input into the system). Mathematically, 
an r-response ($r \geq 1$) system can, therefore, be reasonably 
represented by a set of r algebraic equations 

$$y_i = \eta_i(\xi, \theta) + \varepsilon_i, \qquad i = 1,2,\ldots,r, \qquad (2.1.1)$$

where $\eta_i(\xi,\theta)$ stands for the true value and $y_i$ represents the 
observed value of the ith response, $\xi$ is a q×1 vector of input 
variables, $\theta$, a p×1 vector, is representative of the physical 
parameters of the system, and $\varepsilon_i$, assumed to be additive to 
the true response, accounts for the variability in observations. 

Model (2.1.1), to be denoted as $M^{(0)}$, is fundamentally a 
characterization of some random variable(s), Y, of interest, 
in terms of the parameters $\theta$ and independent variable(s) $\xi$, 
while the observations arising from the system as a result of 
certain inputs can be presumed to be realizations of Y. Thus, 
whereas it is reasonable to assume a probability distribution 
associated with this model, it is as well justifiable to assume 
the existence of a population consisting of observations on the 
characteristic(s), Y, being studied in a given situation. Let 
$df^{(0)}$ be the probability distribution of Y corresponding to 
model $M^{(0)}$ and $\Pi^{(0)}$ be the population which is supposed to 
have been generated by the underlying process. 


Unfortunately, a true mathematical model, such as (2.1.1), 
fully accounting for the underlying mechanism of the process is 
a hypothetical concept and can seldom be realized in non-trivial 
situations. All that one ought to look for is, therefore, a set 
of mathematical equation(s) which can behave considerably akin 
to the process, if not exactly. In other words, a simpler but 
closely approximating model which could describe the outstanding 
features of (2.1.1) is sought for. We consider a situation in 
which several models, m (say), have been hypothesized to possess 
this qualification. These models are claimed to be capable of 
describing the mechanism of the given system equally well and 
are, therefore, taken as rivals, so far as the problem of model 
identification is concerned. Each potential model in this case 
can be entertained as having possibly generated the data. In 
other words, all the rival models can be presumed to offer a 
satisfactory explanation for the observations being taken on 
the random variable(s), Y. It may thus be argued that Y can, 
alternatively, be characterized by model $M^{(u)}$ (say), in terms 
of the parameters $\theta^{(u)}$ and the input variables $\xi$. Thus, if 
we prefer to represent the given system through $M^{(u)}$, then we 
will have 

$$y_i = \eta_i^{(u)}(\xi, \theta^{(u)}) + \varepsilon_i^{(u)}, \qquad i = 1,2,\ldots,r. \qquad (2.1.2)$$


Let $df^{(u)}$ be the probability distribution of Y associated with 
this model. 

Now, considered independently, each of these models, model 
u for instance, must be a true model for some process, 
which may be visualized as generating a population $\Pi^{(u)}$ (say). 
Accordingly, this process can be represented by model u 
through r algebraic equations 

$$y_i^{(u)} = \eta_i^{(u)}(\xi, \theta^{(u)}) + \varepsilon_i^{(u)}, \qquad i = 1,2,\ldots,r \ (r \geq 1), \quad u = 1,2,\ldots,m \ (m \geq 2), \qquad (2.1.3)$$

where $y_i^{(u)}$ stands for the observed value and $\eta_i^{(u)}(\xi,\theta^{(u)})$ 
represents the true value of the ith response from this process, 
depending on $\xi$ and a set of p parameters, $\theta^{(u)}$, while $\varepsilon_i^{(u)}$ 
represents the random factor operating on the process. 
It may, however, be noted that the values $y_i^{(u)}$ in (2.1.3) cannot 
be actually observed because the process is not known; it 
has been assumed to exist hypothetically. Nevertheless, by virtue 
of the argument used, the values simulated through (2.1.2) may 
be treated as belonging to the population $\Pi^{(u)}$. 

Now, the main purpose of building a mechanistic model is 
to be able to replace the given system by certain mathematical 
equation(s) which can simulate such observations as if these 
have been generated by the underlying process itself. 
This suggests that a model is capable of best representing the 
given process if the population simulated by it is considerably 
close (in some statistical sense) to the population associated 
with the process. Therefore, the discrepancy of the model, 
$M^{(u)}$ (say), if any, can be seen through the dissimilarity of 
$\Pi^{(u)}$ from $\Pi^{(0)}$. In other words, in order to check the credibility 
of model u we must look for the closeness between these two 
populations. Since the statistical populations, i.e., the 
type of populations being considered here, can be well 
characterized by their frequency distributions, the akinness 
between $\Pi^{(0)}$ and $\Pi^{(u)}$ can be appropriately assessed through the 
identicalness between their probability distributions. 

In a given situation the aim may also be to acquire a 
discriminatory observation, i.e., to append the available sample 
of n observations with another, (n+1)th, which could add to the 
sharpness in discrimination between models $M^{(u)}$ and $M^{(v)}$, for 
instance. It may similarly be argued that in this case it is the 
divergence between the probability distributions $df_{n+1}^{(u)}$ and 
$df_{n+1}^{(v)}$ (with reference to future), under models u and v, of 
some random variable(s), $Y_{n+1}$, which may be utilized. In case 
of m models there will be m such probability distributions. Any 
divergence among these probability distributions is expected to 
result in a clearer distinction among the underlying models. 

In this work, for the purpose of discrimination and 
designing experiments conducive to discrimination we, 
therefore, plan to employ a distance function through which 
we could evaluate the extent of identicalness between two 
probability distributions. 

2.2 SEARCH FOR AN APPROPRIATE FUNCTION 

Let $(\mathcal{X}, \mathcal{B}, \nu)$ be a measure space and $\mathcal{P}$ be the set of 
all probability measures on $\mathcal{B}$ which are absolutely continuous 
with respect to $\nu$. Consider two such probability measures, 
$P_1, P_2 \in \mathcal{P}$, and let $f_1, f_2$ be the corresponding probability 
density functions such that $f_1 = dP_1/d\nu$ and $f_2 = dP_2/d\nu$. 
Then one of the measures of affinity between the probability 
distributions corresponding to $f_1$ and $f_2$ is defined as 
[Hellinger (1909)] 

$$h(f_1,f_2) = \int (f_1 f_2)^{1/2}\, d\nu. \qquad (2.2.1)$$

But symmetry is the only metric property which is satisfied by the 
function, h. Therefore, it does not qualify to be a metric. 
Since specific to our requirement we need a distance function, 
h is not an appropriate function for discrimination. However, 
the function K, defined as 

$$K(f_1,f_2) = [1 - h(f_1,f_2)]^{1/2}, \qquad (2.2.2)$$

can be verified to be a distance function and is, therefore, 
suitable for the present purpose. 

2.2.1 Metric Properties of the Function, K 

We now prove the metric properties of K. 

(i) $K(f_1,f_2) \geq 0$. 

Clearly, $f_1, f_2 \geq 0$ a.e. $\nu$, so that $h(f_1,f_2) \geq 0$, while the 
Cauchy-Schwarz inequality gives 
$h(f_1,f_2) \leq (\int f_1\, d\nu)^{1/2} (\int f_2\, d\nu)^{1/2} = 1$. 
This leads to the inequality, 

$$0 \leq h(f_1,f_2) \leq 1$$

and hence to the inequality, 

$$K(f_1,f_2) \geq 0.$$

In fact, we have 

$$0 \leq K(f_1,f_2) \leq 1.$$

(ii) $K(f_1,f_2) = 0$ if and only if $f_1 = f_2$ a.e. $\nu$. 

Let $f_1 = f_2$. Then 

$$h(f_1,f_2) = \int f_1\, d\nu = 1.$$

Therefore, 

$$K(f_1,f_2) = \{1 - h(f_1,f_2)\}^{1/2} = 0.$$

Conversely, suppose $K(f_1,f_2) = 0$. 
This implies $\{1 - h(f_1,f_2)\}^{1/2} = 0$ 
and, in turn, we have 

$$h(f_1,f_2) = 1,$$

which is possible only if $f_1 = f_2$ a.e. $\nu$, since 
$1 - h(f_1,f_2) = \frac{1}{2}\int (f_1^{1/2} - f_2^{1/2})^2\, d\nu$. 

(iii) $K(f_1,f_2) = K(f_2,f_1)$. 

The symmetry of K follows from that of h. 

(iv) Consider three probability measures $P_1, P_2, P_3 \in \mathcal{P}$ which 
are absolutely continuous with respect to $\nu$. Let 
$f_1, f_2, f_3$ be the corresponding probability density functions 
such that $f_i = dP_i/d\nu$, i = 1,2,3. Then, 

$$K(f_1,f_3) \leq K(f_1,f_2) + K(f_2,f_3).$$

By definition, 

$$K(f_1,f_3) = \{1 - h(f_1,f_3)\}^{1/2}$$
$$= \frac{1}{\sqrt{2}}\{2 - 2h(f_1,f_3)\}^{1/2}$$
$$= \frac{1}{\sqrt{2}}\Big\{\int f_1\, d\nu + \int f_3\, d\nu - 2\int (f_1 f_3)^{1/2}\, d\nu\Big\}^{1/2}$$
$$= \frac{1}{\sqrt{2}}\Big\{\int (f_1^{1/2} - f_3^{1/2})^2\, d\nu\Big\}^{1/2}$$
$$= \frac{1}{\sqrt{2}}\Big[\int \{(f_1^{1/2} - f_2^{1/2}) + (f_2^{1/2} - f_3^{1/2})\}^2\, d\nu\Big]^{1/2}.$$

Using Minkowski's inequality on the right hand side we get 

$$K(f_1,f_3) \leq \frac{1}{\sqrt{2}}\Big[\Big\{\int (f_1^{1/2} - f_2^{1/2})^2\, d\nu\Big\}^{1/2} + \Big\{\int (f_2^{1/2} - f_3^{1/2})^2\, d\nu\Big\}^{1/2}\Big]$$
$$= \{1 - h(f_1,f_2)\}^{1/2} + \{1 - h(f_2,f_3)\}^{1/2},$$

i.e., 

$$K(f_1,f_3) \leq K(f_1,f_2) + K(f_2,f_3).$$

The properties (i), (ii), (iii), and (iv) show that K is a 
distance function. With this the search for an appropriate 
function is over. 
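
The four properties can also be checked numerically. The following is only an illustrative sketch, not part of the proof: it assumes that $\nu$ is the counting measure on a finite set, so that densities are probability vectors and the integral in (2.2.1) becomes a sum. The function names and the randomly generated densities are hypothetical.

```python
import math
import random

def affinity(f1, f2):
    """Hellinger affinity h(f1, f2) = sum over x of sqrt(f1(x) f2(x))."""
    return sum(math.sqrt(a * b) for a, b in zip(f1, f2))

def distance(f1, f2):
    """K(f1, f2) = [1 - h(f1, f2)]^(1/2), as in (2.2.2)."""
    # max() guards against a tiny negative value caused by rounding
    return math.sqrt(max(0.0, 1.0 - affinity(f1, f2)))

def random_density(k, rng):
    """A random probability vector on k points (illustrative helper)."""
    w = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(w)
    return [x / total for x in w]

rng = random.Random(0)
for _ in range(1000):
    f1, f2, f3 = (random_density(5, rng) for _ in range(3))
    assert 0.0 <= distance(f1, f2) <= 1.0                                    # (i)
    assert distance(f1, f1) < 1e-7                                           # (ii)
    assert abs(distance(f1, f2) - distance(f2, f1)) < 1e-12                  # (iii)
    assert distance(f1, f3) <= distance(f1, f2) + distance(f2, f3) + 1e-12   # (iv)
```

The small tolerances only absorb floating-point rounding; they are not part of the definition of K.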

2.2.2 Utilizing the Distance Function, K, in Model Discrimination 

We now recall the hierarchical relation among the model 
$M^{(u)}$, the population $\Pi^{(u)}$, and the corresponding probability 
distribution $df^{(u)}$, u = 0,1,2,...,m. Keeping in view this 
relation and the fact that K is a suitable measure of identicalness 
between two probability distributions, as established earlier, 
it can be appreciated that K may be utilized in measuring the 
dissimilarity between $\Pi^{(0)}$ and any one of the m populations, 
$\Pi^{(1)}, \Pi^{(2)}, \ldots, \Pi^{(m)}$. This function may, therefore, be utilized 
in arriving at a scale for discriminating among the rival models. 
According to the concept used in this work, discrimination is, 
therefore, proposed to be done by means of a sample estimate 
of the distance between $\Pi^{(0)}$ and $\Pi^{(u)}$, u = 1,2,...,m. We will 
denote this estimate by $K_n^{(u)}$, when calculated through n values 
of Y. Besides, on the basis of the properties possessed by the 
function K, it may also be employed in diverging the probability 
distributions of the random variable(s), $Y_{n+1}$, under the 
models being discriminated and hence in designing discriminating 
experiments. The distance between $df_{n+1}^{(u)}$ and $df_{n+1}^{(v)}$, in this 
case, will be denoted as $K_{u,v}(\xi_{n+1})$, $\xi_{n+1}$ being the (n+1)th 
setting of the experiment yet to be conducted. 

2.3 MODEL DISCRIMINATION : NONSEQUENTIAL 

In practice the investigator might have to choose a model 
from amongst a set of proposed models on the basis of a fixed 
number of observations only. Situations of this type arise 
when the experimental apparatus has already been dismantled 
or the cost of conducting another run of the experiment is 
prohibitively high. This rules out the possibility of 
continuing experimentation beyond n runs. Thus the decision 
on the choice of best model(s) has to be taken on the basis 
of whatever is available at hand. In this case whatever 
discrimination one could achieve through the available data 
would be worth it, for one would be working under such a 
constraint that furthering of discrimination through the addition 
of more informative experiments is not possible. Therefore, 
even if one ends up with a reasonably small number of closest 
models, the rest of the task can be accomplished through other 
considerations, such as cost and convenience, in using 
a particular model from the group of better ones. In a given 
situation, if there are m models being discriminated, it may be 
possible to select a subset of $m_1$ (< m) models by comparing the m 
values of $K_n^{(u)}$, u = 1,2,...,m, while the selection of a single 
model may not be possible on the basis of a fixed number of 
observations. However, with $m_1$ as a small number (say 2 or 3) 
the final selection may be made through some of the practical 
considerations relevant to the given situation. 
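
The shortlisting step described above amounts to ranking the models by their estimated distances. A minimal sketch is given below; the function name, the model labels, and the numerical values of the estimated distances are hypothetical and serve only as an illustration.

```python
def select_closest(k_values, m1):
    """Return the labels of the m1 models with the smallest estimated distance."""
    ranked = sorted(k_values, key=k_values.get)   # ascending in the distance estimate
    return ranked[:m1]

# Hypothetical estimated distances for m = 4 rival models
k_n = {"model 1": 0.41, "model 2": 0.12, "model 3": 0.15, "model 4": 0.63}
shortlist = select_closest(k_n, 2)
```

The choice among the shortlisted models would then rest on the practical considerations mentioned in the text, which the sketch does not attempt to encode.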

2.4 MODEL DISCRIMINATION : SEQUENTIAL 

Suppose that one starts with a certain number of observations, 
n (say), but these do not prove to be carrying enough information 
so as to enable one to take a decision on the best model. Now, 
if, unlike the previous case, there is scope for additional 
experimentation, it may be possible to achieve a better 
discrimination, often arriving at a single model. A few more 
observations appended, sequentially, to the sample of n are 
expected to do the job. However, in this case, in order to 
achieve this objective the investigator needs two criteria: 
a discrimination criterion, for making an assessment of the 
discrimination achieved at every stage so that only the minimum 
number of experiments are conducted, and a design criterion 
which would suggest the setting of the input variable(s) at 
which an experiment, beyond nth stage, should be conducted so 
as to fetch more information for discrimination. 

2.4.1 Discrimination Criterion: An Index 

So far as the decision on the relative adequacy of models 
at the nth stage is concerned, we propose to use the statistic, 
$D_n^{(u)}$, defined as 

$$D_n^{(u)} = \frac{D_{n-1}^{(u)} K_n^{(u)}}{\sum_{v=1}^{m} D_{n-1}^{(v)} K_n^{(v)}}, \qquad u = 1,2,\ldots,m, \qquad (2.4.1)$$

where $D_{n-1}^{(u)}$ is the value of the statistic for model u at the 
previous, (n-1)th, stage and $K_n^{(u)}$ measures the discrepancy of 
this model in explaining the mechanism of the given process 
at the current, nth, stage. The proposed statistic will, 
hereafter, be referred to as Discrimination Index. Since in model 
discrimination problems we start with the assumption that all 
the proposed models are equally plausible, it is reasonable to 
assume, initially, that these models present equally close 
approximations to the process model, $M^{(0)}$. This amounts to 
assuming that $D_0^{(u)} = 1/m$, u = 1,2,...,m. The discrimination 

index thus defined makes use of the discrimination realized 
at a given stage as well as the discrimination already achieved 
through the previous stages. Besides, the value of the index 
will lie between 0 and 1, with both values inclusive. 
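
The recursive nature of the index can be illustrated with a short sketch. It assumes the form of (2.4.1) in which the previous index value of each model is multiplied by its current distance estimate and the products are normalized to sum to one; the function name and the distance estimates used below are hypothetical.

```python
def update_index(d_prev, k_now):
    """One step of (2.4.1): D_n(u) = D_{n-1}(u) K_n(u) / sum_v D_{n-1}(v) K_n(v)."""
    products = [d * k for d, k in zip(d_prev, k_now)]
    total = sum(products)
    if total == 0.0:
        raise ValueError("all distances are zero; the index is undefined")
    return [p / total for p in products]

m = 3
d = [1.0 / m] * m                  # equal starting values, 1/m for every model
k_stages = [[0.10, 0.30, 0.60],    # hypothetical distance estimates, one row per stage
            [0.12, 0.35, 0.55],
            [0.08, 0.40, 0.58]]
for k_n in k_stages:
    d = update_index(d, k_n)
```

In this hypothetical run the first model, whose distance estimates are consistently the smallest, ends with the smallest index, which is the direction of preference stated in Section 2.6.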






2.4.2 Design Criterion: A Weighted Function 

If at the nth stage, the m values of the discrimination 
index do not show enough evidence in favour of a single model 
and further experimentation is possible, the investigator must 
conduct a few more experiments specific to the purpose of 
discrimination. This necessitates the use of a design criterion. 
The criterion that we propose for designing discriminative 
experiments is based on the distance function K, as defined by 
(2.2.2). In fact, for deciding on the (n+1)th setting, we 
consider the random variable(s) $Y_{n+1}$, on which a discriminatory 
observation is yet to be realized. Thus, in a problem of 
discriminating between two models, for example, one could obtain 
a design point by maximizing the function 

$$K_{1,2}(\xi_{n+1}) = [1 - h(f_{n+1}^{(1)}, f_{n+1}^{(2)})]^{1/2} \qquad (2.4.2)$$

with respect to $\xi_{n+1}$, where $f_{n+1}^{(1)}$ and $f_{n+1}^{(2)}$ are the p.d.f.'s of 
$Y_{n+1}$ under models 1 and 2, respectively. But, if there are m 
models being discriminated there will be $\binom{m}{2}$ pairwise distances, 

$$K_{u,v}(\xi_{n+1}) = [1 - h(f_{n+1}^{(u)}, f_{n+1}^{(v)})]^{1/2}, \qquad (2.4.3)$$

with $f_{n+1}^{(u)}$ and $f_{n+1}^{(v)}$ as the p.d.f.'s of $Y_{n+1}$ under models u and v, 
respectively. We will, alternatively, use $K_{u,v;n+1}$ to denote the 
distance such as given by (2.4.3). According to the approach we 
plan to use here, the design point yielding the most informative 
experiment for bringing the best model to light would be the one 
which simultaneously maximized all the distances $K_{u,v;n+1}$. But, 
unfortunately, a point which maximizes one distance may not 
maximize another. There is, thus, the necessity of choosing 
appropriate weights which represent the importance of various 
distances in designing new experiments and then combining all 
the distances into a weighted function. The weights, in fact, 
determine the role of the distances involved in the design of 
experiments. 

The Weights: The weights, which we plan to use, are based on 
the strategy that the models which are closer should receive 
more attention than the ones which are comparatively farther. 
This way a pair with the closest models in it would be attached 
with the highest weight while the one comprising the farthest 
models is given the least importance. One such set of weights 
may be proposed as 


w 


u,vjn 


D (u.) 

n 

1 n 

D (v) 

n 


T 2 D n 


TTI7 


if 

n — n 


otherwise, 


(2.4.4) 


where and are the corresponding normalizing constants. 

The Criterion Function: So far as the criterion function is 
concerned, we prefer the use of the weighted average of the $\binom{m}{2}$ 
pairwise distances, the weights being given by equations (2.4.4). 
To be more specific, in a problem of designing experiments for 
discriminating among m models we propose to choose the next 
design point, $\xi_{n+1}$, at the maximum of 

$$\sum_{u=1}^{m-1}\sum_{v=u+1}^{m} w_{u,v;n}\, K_{u,v}(\xi_{n+1}), \qquad (2.4.5)$$

where $K_{u,v}(\xi_{n+1})$ is given by (2.4.3). It may be noted that 
the search for the maximum is always made over the operability 
region. An experiment conducted at such a setting of the 
input variable(s) is expected to yield the most informative 
(from discrimination point of view) observation on the 
response(s) of the system. 
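
The search for the next design point may be sketched as a grid search over the operability region. The sketch below rests on several assumptions that are not taken from the text: the response is univariate, its distribution under each model is normal with a known variance, the three mean functions are hypothetical, and the pair weights are equal placeholders rather than the weights of (2.4.4).

```python
import math
from itertools import combinations

def k_normal(a1, l1, a2, l2):
    """K between N(a1, l1) and N(a2, l2); l1 and l2 are variances."""
    h = (4.0 * l1 * l2 / (l1 + l2) ** 2) ** 0.25 \
        * math.exp(-(a1 - a2) ** 2 / (4.0 * (l1 + l2)))
    return math.sqrt(max(0.0, 1.0 - h))

def criterion(xi, models, variances, weights):
    """Weighted sum of the pairwise distances at the setting xi."""
    total = 0.0
    for u, v in combinations(range(len(models)), 2):
        total += weights[(u, v)] * k_normal(models[u](xi), variances[u],
                                            models[v](xi), variances[v])
    return total

# Hypothetical rival mean functions of a single input variable xi
models = [lambda xi: 1.0 + 0.5 * xi,
          lambda xi: 1.0 + 0.1 * xi ** 2,
          lambda xi: 1.2 * math.sqrt(xi + 1.0)]
variances = [4.0, 4.0, 4.0]                                # assumed known
weights = {(0, 1): 1 / 3, (0, 2): 1 / 3, (1, 2): 1 / 3}    # placeholder weights

grid = [i * 0.1 for i in range(0, 101)]                    # operability region [0, 10]
xi_next = max(grid, key=lambda xi: criterion(xi, models, variances, weights))
```

A grid search is used only for transparency; any bounded optimizer over the operability region could take its place.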

2.5 DEVELOPMENT OF THE DISCRIMINATION CRITERION 

2.5.1 Some Assumptions: It has already been proposed that the 
discrimination among models can be done through the population, 
$\Pi^{(0)}$, associated with the process, on the one hand and 
the populations, $\Pi^{(1)}, \Pi^{(2)}, \ldots, \Pi^{(m)}$, supposed to have been 
simulated through the competing models on the other. Besides, 
it has also been argued that we have reasons to associate a 
probability distribution with each of these populations. In 
fact, we designated their associated distributions as $df^{(0)}$, 
$df^{(1)}, df^{(2)}, \ldots, df^{(m)}$. In this section, we proceed further to 
specify the particular forms of these distributions and thus 
give a concrete shape to the laid down strategy for 
discrimination among the underlying models. 

It may be recalled that whereas the population, $\Pi^{(0)}$, 
consists of the values of certain random variable(s) observed 
from the system, the elements of the population, $\Pi^{(u)}$, are the 
ones simulated through model $M^{(u)}$, u = 1,2,...,m. Theoretically 
speaking, these populations can be made infinitely large by 
continuing the experimentation and by carrying out the simulation 
through the models, correspondingly. We assume that these 
populations have normal distributions, an assumption which 
can be justified in this case. Thus, in the univariate case, 
whereas we let $\Pi^{(0)}$ have the distribution $N_1(\alpha^{(0)}, \lambda^{(0)})$, the 
distribution of $\Pi^{(u)}$ may be specified as $N_1(\alpha^{(u)}, \lambda^{(u)})$, 
u = 1,2,...,m. Similarly, when the underlying populations 
are multivariate, we assume that $\Pi^{(0)}$ has an r-variate normal 
distribution: $N_r(\alpha^{(0)}, \Lambda^{(0)})$, while the distribution 
$N_r(\alpha^{(u)}, \Lambda^{(u)})$ is appropriate for the population $\Pi^{(u)}$, 
u = 1,2,...,m. The development so far brings to light the 
important point that the proposed discrimination criterion 
is to be based on the identicalness of normal distributions: 
$N_1(\alpha^{(0)}, \lambda^{(0)})$ and $N_1(\alpha^{(u)}, \lambda^{(u)})$, or $N_r(\alpha^{(0)}, \Lambda^{(0)})$ and 
$N_r(\alpha^{(u)}, \Lambda^{(u)})$, u = 1,2,...,m, according as the system is 
uniresponse or multiresponse. We, therefore, derive below 
the explicit expressions for the distance function, K, 
defined for two p.d.f.'s under the Gaussian set-up, for 
the univariate and multivariate cases, respectively. 

2.5.2 Evaluating the Akinness Between Two Normal Populations 

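Case 1: Univariate Populations 

Result 1: Let $\Pi_1$ and $\Pi_2$ be two univariate populations with 
normal probability distributions: $N_1(\alpha_1,\lambda_1)$ and $N_1(\alpha_2,\lambda_2)$, 
respectively. Then the dissimilarity between these populations 
is given by 

$$K(\Pi_1,\Pi_2) = \Big[1 - \Big(\frac{\lambda_1\lambda_2}{\lambda^2}\Big)^{1/4} \exp\Big\{-\frac{1}{8}\,\frac{(\alpha_1-\alpha_2)^2}{\lambda}\Big\}\Big]^{1/2}, \qquad (2.5.1)$$

where 

$$\lambda = \frac{\lambda_1+\lambda_2}{2}. \qquad (2.5.2)$$

Proof: Let $f_1$ and $f_2$ denote the p.d.f.'s of the random 
variable Y corresponding to the populations, $\Pi_1$ and $\Pi_2$, 
respectively, so that 

$$f_i(y) = [2\pi\lambda_i]^{-1/2} \exp\Big[-\frac{1}{2}\,\frac{(y-\alpha_i)^2}{\lambda_i}\Big], \qquad -\infty < y < \infty, \quad i = 1,2. \qquad (2.5.3)$$

We first evaluate the function, h, given by 

$$h(f_1,f_2) = \int_{R^1} [f_1(y) f_2(y)]^{1/2}\, dy. \qquad (2.5.4)$$

Using (2.5.3) in (2.5.4), we have 

$$h(f_1,f_2) = [\lambda_1\lambda_2]^{-1/4}(2\pi)^{-1/2} \int_{R^1} \exp\Big[-\frac{1}{4}\Big\{\frac{(y-\alpha_1)^2}{\lambda_1} + \frac{(y-\alpha_2)^2}{\lambda_2}\Big\}\Big]\, dy. \qquad (2.5.5)$$

By relation (A.1.11) of Lemma A.1 (Appendix A), the 
expression in the exponent of the integrand can be written as 

$$\frac{(y-\alpha_1)^2}{\lambda_1} + \frac{(y-\alpha_2)^2}{\lambda_2} = \frac{(y-\alpha^*)^2}{\lambda^*} + \frac{(\alpha_1-\alpha_2)^2}{\lambda_1+\lambda_2},$$

where 

$$\alpha^* = \frac{\lambda_2\alpha_1 + \lambda_1\alpha_2}{\lambda_1+\lambda_2}$$

and 

$$\lambda^* = \Big(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\Big)^{-1}. \qquad (2.5.6)$$

Using these relations in (2.5.5), we obtain 

$$h(f_1,f_2) = [\lambda_1\lambda_2(2\lambda^*)^{-2}]^{-1/4} \exp\Big[-\frac{1}{4}\Big\{\frac{(\alpha_1-\alpha_2)^2}{\lambda_1+\lambda_2}\Big\}\Big] \times \int_{R^1} [2\pi(2\lambda^*)]^{-1/2} \exp\Big[-\frac{1}{2}\Big\{\frac{(y-\alpha^*)^2}{2\lambda^*}\Big\}\Big]\, dy$$
$$= [\lambda_1\lambda_2(2\lambda^*)^{-2}]^{-1/4} \exp\Big[-\frac{1}{4}\Big\{\frac{(\alpha_1-\alpha_2)^2}{\lambda_1+\lambda_2}\Big\}\Big]. \qquad (2.5.7)$$

Substituting for $\lambda^*$ from equation (2.5.6) we get 

$$h(f_1,f_2) = \Big[\frac{4\lambda_1\lambda_2}{(\lambda_1+\lambda_2)^2}\Big]^{1/4} \exp\Big[-\frac{1}{4}\,\frac{(\alpha_1-\alpha_2)^2}{\lambda_1+\lambda_2}\Big]. \qquad (2.5.8)$$

Now, if we let $\lambda = (\lambda_1+\lambda_2)/2$, then h can be written in a 
nice form like 

$$h(f_1,f_2) = \Big[\frac{\lambda_1\lambda_2}{\lambda^2}\Big]^{1/4} \exp\Big[-\frac{1}{8}\,\frac{(\alpha_1-\alpha_2)^2}{\lambda}\Big]$$

so that, for the given situation, the distance function, K, 
defined in (2.2.2), can be expressed in the form 

$$K(\Pi_1,\Pi_2) = \Big[1 - \Big(\frac{\lambda_1\lambda_2}{\lambda^2}\Big)^{1/4} \exp\Big\{-\frac{1}{8}\,\frac{(\alpha_1-\alpha_2)^2}{\lambda}\Big\}\Big]^{1/2}.$$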
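
As an independent check on the closed form obtained for two univariate normal populations, the affinity h can be compared with a direct numerical evaluation of the integral of $[f_1 f_2]^{1/2}$. The sketch below is only illustrative: it takes the second parameter of each normal distribution to be a variance, uses a simple trapezoidal rule over a wide finite interval, and the function names and parameter values are hypothetical.

```python
import math

def h_closed(a1, l1, a2, l2):
    """Closed form of h for N(a1, l1) and N(a2, l2); l1 and l2 are variances."""
    return (4.0 * l1 * l2 / (l1 + l2) ** 2) ** 0.25 \
        * math.exp(-(a1 - a2) ** 2 / (4.0 * (l1 + l2)))

def normal_pdf(y, a, l):
    """Density of N(a, l) at y."""
    return math.exp(-0.5 * (y - a) ** 2 / l) / math.sqrt(2.0 * math.pi * l)

def h_numeric(a1, l1, a2, l2, lo=-40.0, hi=40.0, steps=20000):
    """Trapezoidal approximation of the integral of sqrt(f1 f2) over [lo, hi]."""
    dy = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * dy
        w = 0.5 if i in (0, steps) else 1.0
        total += w * math.sqrt(normal_pdf(y, a1, l1) * normal_pdf(y, a2, l2))
    return total * dy

def k_closed(a1, l1, a2, l2):
    """K = [1 - h]^(1/2) for two univariate normal populations."""
    return math.sqrt(max(0.0, 1.0 - h_closed(a1, l1, a2, l2)))

h_exact = h_closed(0.0, 1.0, 1.5, 2.5)
h_quad = h_numeric(0.0, 1.0, 1.5, 2.5)
```

The two values agree to well within the accuracy of the quadrature, which supports the algebra of the proof for this particular pair of populations.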


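With a specific interest we also write an alternative 
form of $h(f_1,f_2)$, namely, 

$$h(f_1,f_2) = \Big[\frac{(\lambda_1+\lambda_2)^2}{4\lambda_1\lambda_2}\Big]^{-1/4} \exp\Big[-\frac{1}{4}\,\frac{(\alpha_1-\alpha_2)^2}{\lambda_1+\lambda_2}\Big], \qquad (2.5.9)$$

so as to evaluate the function, 

$$L(\Pi_1,\Pi_2) = -\log_e h(f_1,f_2).$$

In fact, on using equation (2.5.9) we get 

$$L(\Pi_1,\Pi_2) = \frac{1}{4} D_1(\alpha_1,\alpha_2) + \frac{1}{4} D_2(\lambda_1,\lambda_2), \qquad (2.5.10)$$

where 

$$D_1(\alpha_1,\alpha_2) = (\lambda_1+\lambda_2)^{-1}(\alpha_1-\alpha_2)^2 \qquad (2.5.11)$$

and 

$$D_2(\lambda_1,\lambda_2) = 2\log_e\Big(\frac{\lambda_1+\lambda_2}{2}\Big) - \log_e\lambda_1 - \log_e\lambda_2. \qquad (2.5.12)$$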
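
The split of L into a location part and a spread part can be verified numerically. The sketch below assumes the readings $D_1 = (\alpha_1-\alpha_2)^2/(\lambda_1+\lambda_2)$ and $D_2 = 2\log_e\{(\lambda_1+\lambda_2)/2\} - \log_e\lambda_1 - \log_e\lambda_2$, each entering L with coefficient 1/4; the function names and parameter values are hypothetical.

```python
import math

def d1(a1, a2, l1, l2):
    """Location component: (a1 - a2)^2 / (l1 + l2)."""
    return (a1 - a2) ** 2 / (l1 + l2)

def d2(l1, l2):
    """Spread component: 2 log((l1 + l2)/2) - log l1 - log l2."""
    return 2.0 * math.log((l1 + l2) / 2.0) - math.log(l1) - math.log(l2)

def L(a1, l1, a2, l2):
    """L = -log h, written as one quarter of D1 plus one quarter of D2."""
    return 0.25 * d1(a1, a2, l1, l2) + 0.25 * d2(l1, l2)

def h_closed(a1, l1, a2, l2):
    """Affinity between N(a1, l1) and N(a2, l2); l1 and l2 are variances."""
    return (4.0 * l1 * l2 / (l1 + l2) ** 2) ** 0.25 \
        * math.exp(-(a1 - a2) ** 2 / (4.0 * (l1 + l2)))

# exp(-L) should reproduce the affinity h for any admissible parameter values
gap = abs(math.exp(-L(0.0, 1.0, 1.5, 2.5)) - h_closed(0.0, 1.0, 1.5, 2.5))
```

Here `d1` vanishes exactly when the two means coincide, and `d2` vanishes exactly when the two variances coincide (by the inequality between the arithmetic and geometric means), so each component isolates one kind of disagreement.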

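Case 2: Multivariate Populations 

Result 2: Let $\Pi_1$ and $\Pi_2$ be two r-variate normal 
populations with non-singular r-dimensional probability 
distributions: $N_r(\alpha_1,\Lambda_1)$ and $N_r(\alpha_2,\Lambda_2)$. Then the dissimilarity 
between $\Pi_1$ and $\Pi_2$ is given by 

$$K(\Pi_1,\Pi_2) = \Big[1 - \Big(\frac{|\Lambda_1||\Lambda_2|}{|\Lambda|^2}\Big)^{1/4} \exp\Big\{-\frac{1}{8}(\alpha_1-\alpha_2)'\Lambda^{-1}(\alpha_1-\alpha_2)\Big\}\Big]^{1/2}, \qquad (2.5.13)$$

where 

$$\Lambda = (\Lambda_1 + \Lambda_2)/2. \qquad (2.5.14)$$

Proof: Since the probability distribution of $\Pi_i$ is r-variate 
normal: $N_r(\alpha_i,\Lambda_i)$, i = 1,2, the probability density functions, 
$f_1$ and $f_2$ (say), of a random vector Y (say) are given by 

$$f_i(y) = [(2\pi)^r|\Lambda_i|]^{-1/2} \exp\Big\{-\frac{1}{2}(y-\alpha_i)'\Lambda_i^{-1}(y-\alpha_i)\Big\}, \qquad i = 1,2, \qquad (2.5.15)$$

where $\Lambda_i$ is an r×r positive definite symmetric covariance matrix, 
$|\Lambda_i|$ denotes its determinant, $\Lambda_i^{-1}$ stands for its inverse, and 
y and $\alpha_i$ are r-vectors. By definition, the distance function, K, 
is given by 

$$K(\Pi_1,\Pi_2) = \Big[1 - \int_{R^r} \{f_1(y) f_2(y)\}^{1/2}\, dy\Big]^{1/2}. \qquad (2.5.16)$$

Substituting for $f_1(y)$ and $f_2(y)$ from equations (2.5.15) in the 
function 

$$h(f_1,f_2) = \int_{R^r} \{f_1(y) f_2(y)\}^{1/2}\, dy,$$

we have 

$$h(f_1,f_2) = [(2\pi)^{2r}|\Lambda_1||\Lambda_2|]^{-1/4} \int_{R^r} \exp\Big[-\frac{1}{4}\{(y-\alpha_1)'\Lambda_1^{-1}(y-\alpha_1) + (y-\alpha_2)'\Lambda_2^{-1}(y-\alpha_2)\}\Big]\, dy. \qquad (2.5.17)$$

Using equations (A.1.1), (A.1.2), and (A.1.3) of Lemma A.1 
(Appendix A), we can write 

$$(y-\alpha_1)'\Lambda_1^{-1}(y-\alpha_1) + (y-\alpha_2)'\Lambda_2^{-1}(y-\alpha_2) = (y-\alpha^*)'\Lambda^*(y-\alpha^*) + (\alpha_1-\alpha_2)'(\Lambda_1+\Lambda_2)^{-1}(\alpha_1-\alpha_2),$$

where $\alpha^* = \Lambda_2(\Lambda_1+\Lambda_2)^{-1}\alpha_1 + \Lambda_1(\Lambda_1+\Lambda_2)^{-1}\alpha_2$ 
and $\Lambda^* = \Lambda_1^{-1}(\Lambda_1+\Lambda_2)\Lambda_2^{-1}$. 

These relations when used in (2.5.17) give 

$$h(f_1,f_2) = \Big[\frac{2^{2r}|\Lambda_1||\Lambda_2|}{|\Lambda_1+\Lambda_2|^2}\Big]^{1/4} \exp\Big[-\frac{1}{4}(\alpha_1-\alpha_2)'(\Lambda_1+\Lambda_2)^{-1}(\alpha_1-\alpha_2)\Big].$$

And, if we let $\Lambda = (\Lambda_1+\Lambda_2)/2$, i.e., pool the covariance matrices, 
then h can be written as 

$$h(f_1,f_2) = \Big[\frac{|\Lambda_1||\Lambda_2|}{|\Lambda|^2}\Big]^{1/4} \exp\Big[-\frac{1}{8}(\alpha_1-\alpha_2)'\Lambda^{-1}(\alpha_1-\alpha_2)\Big].$$

Accordingly from (2.2.2) we obtain the dissimilarity between 
$\Pi_1$ and $\Pi_2$ in the form 

$$K(\Pi_1,\Pi_2) = \Big[1 - \Big(\frac{|\Lambda_1||\Lambda_2|}{|\Lambda|^2}\Big)^{1/4} \exp\Big\{-\frac{1}{8}(\alpha_1-\alpha_2)'\Lambda^{-1}(\alpha_1-\alpha_2)\Big\}\Big]^{1/2}, \qquad (2.5.19)$$

where 

$$\Lambda = (\Lambda_1+\Lambda_2)/2. \qquad (2.5.20)$$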
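
For small r the multivariate distance is easy to compute directly. The following is a minimal sketch under the assumption that the affinity equals $(|\Lambda_1||\Lambda_2|/|\Lambda|^2)^{1/4}\exp\{-\frac{1}{8}(\alpha_1-\alpha_2)'\Lambda^{-1}(\alpha_1-\alpha_2)\}$ with the pooled matrix $\Lambda = (\Lambda_1+\Lambda_2)/2$. The helper `det_and_solve` and the function name `k_multinormal` are hypothetical; plain Gaussian elimination is used only to keep the example free of external libraries.

```python
import math

def det_and_solve(a, b):
    """Return (det(a), x) with a x = b, by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented copy of a
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-14:
            raise ValueError("matrix is singular")
        if p != c:
            m[c], m[p] = m[p], m[c]
            det = -det                                  # a row swap flips the sign
        det *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        s = m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / m[r][r]
    return det, x

def k_multinormal(a1, c1, a2, c2):
    """K between N_r(a1, c1) and N_r(a2, c2), using the pooled matrix (c1 + c2)/2."""
    r = len(a1)
    pooled = [[0.5 * (c1[i][j] + c2[i][j]) for j in range(r)] for i in range(r)]
    diff = [a1[i] - a2[i] for i in range(r)]
    det_p, sol = det_and_solve(pooled, diff)            # sol = pooled^{-1} diff
    det1, _ = det_and_solve(c1, [0.0] * r)
    det2, _ = det_and_solve(c2, [0.0] * r)
    quad = sum(d * s for d, s in zip(diff, sol))
    h = (det1 * det2 / det_p ** 2) ** 0.25 * math.exp(-quad / 8.0)
    return math.sqrt(max(0.0, 1.0 - h))

# Hypothetical bivariate example
k_example = k_multinormal([0.0, 1.0], [[1.0, 0.2], [0.2, 2.0]],
                          [1.5, -1.0], [[2.5, -0.3], [-0.3, 0.5]])
```

With r = 1 the function reduces to the univariate expression, and for diagonal covariance matrices the affinity factorizes into the product of the componentwise univariate affinities, which gives two simple consistency checks.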

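In addition, we find that the function L, defined as 

$$L(\Pi_1,\Pi_2) = -\log_e h(f_1,f_2),$$

becomes 

$$L(\Pi_1,\Pi_2) = \frac{1}{4} D_1(\alpha_1,\alpha_2) + \frac{1}{4} D_2(\Lambda_1,\Lambda_2), \qquad (2.5.21)$$

where 

$$D_1(\alpha_1,\alpha_2) = (\alpha_1-\alpha_2)'(\Lambda_1+\Lambda_2)^{-1}(\alpha_1-\alpha_2) \qquad (2.5.22)$$

and 

$$D_2(\Lambda_1,\Lambda_2) = 2\log_e\Big|\frac{\Lambda_1+\Lambda_2}{2}\Big| - \log_e|\Lambda_1| - \log_e|\Lambda_2|. \qquad (2.5.23)$$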

2.5.3 An Important Observation 

Although our main interest in both the cases, univariate 
and multivariate, is in the function K, the function L, given 
in equations (2.5.10) and (2.5.21), has been worked out in order 
to bring out an important point. Consider again the populations 
$\Pi_1$ and $\Pi_2$. Since these populations have been assumed to be 
r-variate normal statistical populations ($r \geq 1$), each can be 
described by an r-dimensional normal probability distribution. 
To be more specific, the population $\Pi_1$ can be completely 
specified by the set of r(r+3)/2 parameters, 

$$\{\alpha_{11},\alpha_{12},\ldots,\alpha_{1r};\ \lambda_{111},\lambda_{122},\ldots,\lambda_{1rr};\ \lambda_{112},\lambda_{113},\ldots,\lambda_{1(r-1)r}\},$$

of which the first r are the location parameters and the 
remaining indicate the orientation of $\Pi_1$. Similarly, $\Pi_2$ can 
also be specified by another set of r(r+3)/2 parameters, 

$$\{\alpha_{21},\alpha_{22},\ldots,\alpha_{2r};\ \lambda_{211},\lambda_{222},\ldots,\lambda_{2rr};\ \lambda_{212},\lambda_{213},\ldots,\lambda_{2(r-1)r}\},$$

where $\alpha_{21},\alpha_{22},\ldots,\alpha_{2r}$ specify the location and the remaining 
describe the orientation of $\Pi_2$. This shows that the dissimilarity 
between $\Pi_1$ and $\Pi_2$ and, for that matter, between any two normal 
populations can be judged through the disagreement between the 
two types of parameters; namely, the location and the orientation 
of these populations. Now, an examination of the expressions 
(2.5.10) and (2.5.21) for $L(\Pi_1,\Pi_2)$ shows that the first component 
measures the dissimilarity of the two populations with respect 
to their locations, while the second component distinguishes 
these in terms of their orientations. Thus the function K 
seems to do the job. 

2.5.4 Distance for Assessment of the Appropriateness of a 
Model 

Univariate Case: 

Suppose that, to start with, we have n observations, 
$y = (y_1,y_2,\ldots,y_n)'$, at our disposal. We presume this set of 
values to be a sample coming from the population $\Pi^{(0)}$. 
Correspondingly, another set $y^{(u)} = (y_1^{(u)},y_2^{(u)},\ldots,y_n^{(u)})'$ may 
be simulated through model $M^{(u)}$ and identified as a sample from 
the population $\Pi^{(u)}$. Now the distributions of $\Pi^{(0)}$ and $\Pi^{(u)}$ 
being $N(\alpha^{(0)},\lambda^{(0)})$ and $N(\alpha^{(u)},\lambda^{(u)})$, respectively, the 
dissimilarity between the parent populations of the two samples 
is given by [Refer equations (2.5.1) and (2.5.2) of Section 2.5] 

$$K(\Pi^{(0)},\Pi^{(u)}) = \Big[1 - \Big\{\frac{4\lambda^{(0)}\lambda^{(u)}}{(\lambda^{(0)}+\lambda^{(u)})^2}\Big\}^{1/4} \exp\Big\{-\frac{(\alpha^{(0)}-\alpha^{(u)})^2}{4(\lambda^{(0)}+\lambda^{(u)})}\Big\}\Big]^{1/2}. \qquad (2.5.24)$$

A sample estimate of $K(\Pi^{(0)},\Pi^{(u)})$ and hence of the discrepancy 
of model u in explaining the mechanism of the process may thus 
be proposed as 

$$K_n^{(u)} = \Big[1 - \Big\{\frac{4 s_n^2 s_{u,n}^2}{(s_n^2+s_{u,n}^2)^2}\Big\}^{1/4} \exp\Big\{-\frac{(\bar y_n - \bar y_{u,n})^2}{4(s_n^2+s_{u,n}^2)}\Big\}\Big]^{1/2}, \qquad (2.5.25)$$

where $\bar y_n = \frac{1}{n}\, y'J_{n1}$ and $\bar y_{u,n} = \frac{1}{n}\, y^{(u)\prime}J_{n1}$ are the sample 
means, and $s_n^2$ and $s_{u,n}^2$ are the sample variances based on the 
corrected sums of squares $y'(I - \frac{1}{n}J_{nn})y$ and 
$y^{(u)\prime}(I - \frac{1}{n}J_{nn})y^{(u)}$, respectively, 
with $J_{ab}$ as an a×b matrix of unit elements. In fact such an 
estimate is obtained by plugging in the estimates of the 
unknown quantities, $\alpha^{(0)}$, $\lambda^{(0)}$, $\alpha^{(u)}$, and $\lambda^{(u)}$, involved in 
(2.5.24) and will be termed as plug-in estimate. 
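
A plug-in estimate of this kind can be computed in a few lines. The sketch below is only an illustration: it replaces the population means and variances in (2.5.24) by sample means and sample variances, and it assumes the usual divisor n-1 for the sample variance, a choice the text does not make explicit. The function name and the data are hypothetical.

```python
import math
from statistics import mean, variance

def k_plugin(y_obs, y_sim):
    """Plug-in estimate of K from observed values and values simulated by a model."""
    ybar, ubar = mean(y_obs), mean(y_sim)
    s2, su2 = variance(y_obs), variance(y_sim)   # divisor n-1 (an assumption)
    h = (4.0 * s2 * su2 / (s2 + su2) ** 2) ** 0.25 \
        * math.exp(-(ybar - ubar) ** 2 / (4.0 * (s2 + su2)))
    return math.sqrt(max(0.0, 1.0 - h))

# Hypothetical observed responses and responses simulated through two rival models
observed = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0]
simulated_a = [2.0, 3.1, 4.0, 5.1, 5.9, 7.2]     # tracks the observations closely
simulated_b = [1.0, 1.2, 1.9, 2.1, 2.4, 3.0]     # systematically too low
k_a = k_plugin(observed, simulated_a)
k_b = k_plugin(observed, simulated_b)
```

In this hypothetical example the first model yields the smaller estimate, so it would be regarded as the closer of the two to the process.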


Multivariate Case s When the system being investigated is 
multiresponse, the n sets of observations, y = (y^ ,y 0 » *•» * *30 » 

£2 rvJL roj JL 

arisen from the system may be considered to form a sample 
from the population while the n sets of simulated 

observations, y^ = (y^ u \y^, . . . ,y^ ) * may be treated as 
a sample from n (u) , u = 1,2, , ,.,m. In this case, since the 
parent populations have been assumed to have normal distributions, 
, A^°^) and N (a^ u ^,A^), respectively, the required 

X rv# X r\J 

dissimilarity is given by [Refer equations (2,5.13) and (2.5^) 
of Section 2.5] 

k(h ( o) , n (u) ) = 


4|a (o) a ( u) I 



exp {- ^ (a 


(o) 


a (u) )tft (o) ♦ n (u) ) _1 (a (o) - a (u) )}] 1/2 . 

^ fss - a ; . 


(2.5.26) 



58 


Therefore, a plug- in-estimate for assessing the appropriateness 
of model u for the process is given by 



(2.5-. 27) 


where 


= l' vl 0) - I X' J : 


= y' u >' (I - 1 Jjy<") , y' U) = i y (u > ' J 


n nn 


Having calculated the estimates of the distances
$K(\Pi^{(o)}, \Pi^{(u)})$ from (2.5.25) or (2.5.27), according as the
system is uniresponse or multiresponse, the next logical step is to
update the discrimination index defined in (2.4.1) accordingly, so as
to have an overall picture of the discrimination achieved through the
available observations at the nth stage.


2.6 TERMINATION CRITERION

In sequential design of experiments for discrimination, it is
important to assess, at each stage, the level of discrimination
achieved up to that stage. It has been proposed to do so through the
discrimination index defined in (2.4.1). Equally important is the
decision to terminate the sequential process at the point at which
one's goal has been met. Such a decision has considerable
significance, especially when the designed experiments are to be
conducted on a real-life system, where each run of the experiment may
be extremely costly. What is needed, therefore, is a stopping rule
through which one can stop at the right stage and avoid unnecessary
experimentation.

According to the approach used here, of two models we prefer the one
with the smaller value of the discrimination index, indicating that
this model is closer to the true model, the assessment being based on
the distance between the corresponding populations. Thus, when m
models are being discriminated, one always awaits a stage at which
the models close to the true model correspond to populations that are
sufficiently apart from one another.


In the present approach, the decision on the termination of the
sequential procedure can normally be taken on the basis of subjective
judgement, i.e., through a comparison of the values attained by the
discrimination index at a particular stage. But there may be
situations in which the best model does not differ much from its
closest rival in terms of the distance. In order to tackle these
types of situations we propose to stop at the stage n* (say) if


\[
\mathrm{ABS}\left[ \frac{D^{(u^+)}_{n^*}}{D^{(u^+)}_{n^*} + D^{(u^*)}_{n^*}} - \frac{D^{(u^+)}_{(n^*-1)}}{D^{(u^+)}_{(n^*-1)} + D^{(u^*)}_{(n^*-1)}} \right] < 0.001, \tag{2.6.1}
\]


where $D^{(u)}_{(n^*-1)}$ and $D^{(u)}_{n^*}$ are the values of the
discrimination index for model $M^{(u)}$ at two consecutive stages,
and $u^*$, $u^+$ stand for the best and the second best models,
respectively.
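
A stopping rule of this kind is simple to mechanize. The sketch below assumes that (2.6.1) compares, at two consecutive stages, the share of the second best model's index in the sum of the indices of the two leading models; the function and argument names are hypothetical.

```python
def should_stop(d_best_prev, d_second_prev, d_best_now, d_second_now, tol=0.001):
    """Return True when the share of the second best model has stabilized.

    d_best_* and d_second_* are the discrimination-index values of the best
    (u*) and second best (u+) models at stages n* - 1 and n*.
    ASSUMPTION: the criterion is |share_now - share_prev| < tol, where
    share = D(u+) / (D(u+) + D(u*)).
    """
    share_prev = d_second_prev / (d_second_prev + d_best_prev)
    share_now = d_second_now / (d_second_now + d_best_now)
    return abs(share_now - share_prev) < tol
```

In a sequential run this check would be made after each new observation, once the indices have been updated; the tolerance 0.001 is the value quoted in the text.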

2.7 TESTING THE SIGNIFICANCE OF THE
DISTANCE BETWEEN TWO MODELS

In the sense pertinent to this study, the distance between two
models, $M_1$ and $M_2$ (say), can be measured through the distance
between their associated populations, $\Pi_1$ and $\Pi_2$ (say).
Therefore, testing the significance of the distance between two
models amounts to confirming whether the two populations, $\Pi_1$ and
$\Pi_2$, are close enough for the corresponding models, $M_1$ and
$M_2$, to be considered substitutes for each other. This can be done
by testing the hypothesis

\[ H_0 : K(\Pi_1, \Pi_2) = 0, \tag{2.7.1} \]

or, equivalently, by testing the hypothesis

\[ H_{01} : L(\Pi_1, \Pi_2) = 0, \tag{2.7.2} \]

where $L(\Pi_1, \Pi_2)$ is given by (2.5.10) or (2.5.21), according
as the populations are univariate or multivariate. The equivalence
of $H_0$ and $H_{01}$ follows from the fact that

$K(\Pi_1, \Pi_2) = 0$ if and only if $[1 - \exp\{-L(\Pi_1, \Pi_2)\}]^{1/2} = 0$,

i.e., if and only if $\exp\{-L(\Pi_1, \Pi_2)\} = 1$,

i.e., if and only if $L(\Pi_1, \Pi_2) = 0$.


Multivariate Case:

We will, therefore, formulate a test for the hypothesis $H_{01}$.
This will be done for the multivariate case first; the one for the
univariate case goes in parallel. A look at equation (2.5.21)
further suggests that $L(\Pi_1, \Pi_2) = 0$ if and only if
$D_1(\alpha_1, \alpha_2) = 0$ and $D_2(\Lambda_1, \Lambda_2) = 0$.
Therefore, the required confirmation about the similarity between
$\Pi_1$ and $\Pi_2$ can be done by testing the hypothesis

\[ H_{02} : \alpha_1 = \alpha_2 \ \text{and} \ \Lambda_1 = \Lambda_2. \tag{2.7.3} \]

To that purpose we consider two hypotheses, namely,
$H_1 : \Lambda_1 = \Lambda_2$ and $H_2 : \alpha_1 = \alpha_2$, given
$\Lambda_1 = \Lambda_2$. Besides, let $\Omega$ be the unrestricted
parameter space, i.e., $\Omega = \{\alpha_1, \alpha_2, \Lambda_1, \Lambda_2\}$, and let

\[ \Omega_1 = \{\alpha_1, \alpha_2, \Lambda_1, \Lambda_2 \mid \Lambda_1 = \Lambda_2\}, \qquad \Omega_2 = \{\alpha_1, \alpha_2, \Lambda_1, \Lambda_2 \mid \alpha_1 = \alpha_2, \ \Lambda_1 = \Lambda_2\} \]

be two restricted parameter spaces. Suppose the r-variate random
vector Y has p.d.f. $f(y; \omega)$, where $\omega$ is a parameter
point in the space $\Omega$. In that case the hypothesis $H_1$ is
that $\omega$ belongs to the space $\Omega_1$; the hypothesis $H_2$
means that $\omega$ falls in $\Omega_2$, given that it is in
$\Omega_1 \supset \Omega_2$; and finally, $H_{02}$ is the hypothesis
that $\omega$ is a point of $\Omega_2$, given that it is in $\Omega$.
Such a formulation enables us to use the lemma due to Anderson
(1957). We thus propose to use the statistic, $W_1$, given by

(„)iV2 |! 1*1/2 IX | *2/2 

W X 7 rv • (2-7.4) 

| A *|V/ 2 (o^j l/ 2 (^) 2/ 2 

for testing the hypothesis under consideration, where $\nu_u$ is
the number of degrees of freedom associated with the estimate
$\hat{\Lambda}_u$ of $\Lambda_u$, and $\Lambda^*$ is given by


A* = 2- tA + n(y -y) (y -y )' ] (2.7.5) 

^ XjU k , 

with Xu - ( ZuJ m )/n« l = ^1^/2, u - 1,2 

AT * 

$J_{nn}$ is the $n \times n$ matrix with all unit elements, and $\nu$
in (2.7.4) is the sum $\nu_1 + \nu_2$.

Using Box (1949), the distribution of $W_1$ under $H_{02}$ is given by


P(-2p log e WjjC z) =P(X|< z) + r.{P(x| + 4 < z)-P(x|< 2)}-0(n” 3 ), 

(2.7.6) 

where 


p a l. Jl+U) + — 

p Vt v' l 6(r+3) j v 


'1 2 
d = r(r+l)/2 




7 


= — S [6(r+2)(r + l)(rwl) (-i + J* -4s) 

288 P* 


*1 


V*- 

2 


. JteaaaJL lz (A + -L . 2)2 

r+3 ^2 ^ 

. i 2 iZ Z 2 *prl)? (JL + i. . l) 

v (r+3 ) V v- L v v' 


- 36 


(r+3)^ 2 


+ 24—1], 


(2.7.7) 

(2.7.8) 


(2.7.9) 


and n is the number of observations on the basis of which the
decision about the akinness of $\Pi_1$ and $\Pi_2$ is to be taken.

Under the hypothesis $H_{02}$, the statistic V, defined as

\[ V = -2\rho \log_e W_1, \tag{2.7.10} \]

is distributed approximately according to the $\chi^2$ distribution
with d degrees of freedom. This approximation can be safely used if
the coefficient of the correction term in (2.7.6) is small, say of
the order of 0.001.

Univariate test: A study of equation (2.5.16), in this case too,
suggests that an equivalent hypothesis would be

\[ H_{02} : \alpha_1 = \alpha_2 \ \text{and} \ \lambda_1 = \lambda_2, \tag{2.7.11} \]

so that, having defined similar parametric spaces and formulated the
same set of hypotheses, we may propose to utilize the statistic
[Anderson (1957)]


¥ = 
2 


, a , y i/2 a v 2 /2 
( V 1 X 1 ) ^ 2^2 


(2.7.12) 


r A A n i n 2 /* A 

[) Vi + Vs + (a i- a z ) 3 


A , 2 ( y » 2 )/2 


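for testing the given hypothesis, where $\hat{\alpha}_1$,
$\hat{\alpha}_2$, $\hat{\lambda}_1$, $\hat{\lambda}_2$ are the
appropriate estimates of $\alpha_1$, $\alpha_2$, $\lambda_1$ and
$\lambda_2$, respectively, and $\nu_1$, $\nu_2$ are the degrees of
freedom associated with $\hat{\lambda}_1$ and $\hat{\lambda}_2$,
respectively.

So far as the distribution is concerned, using Box (1949), the
statistic F, given by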
F 


Td 


d 2 ¥ 2 


* (2.7.13) 


is distributed as Snedecor's F with $d_1$ and $d_2$ degrees of freedom,
where

v-1 


df = ! , d 2 = 3/ P , b = d 2 (l - p + 2/d 2 r x t 


and 




Vs 


- rr-tr-] • 


v l* v 2 




When P assumes the form (’jirjj)* s0 ^at ^ 

(2.7.15) becomes 

12o>3(6 co 2 — 5 os + 1) W 0 

F = 7- --5 (2.7.14) 

[72 gT - (6 gj — 3m+l)W 2 ] 

and has an F-distribution with 1 and $12\omega^2$ degrees of freedom,
where $\omega = (\nu_1 + \nu_2)/2$.

2.8 A MODEL ADEQUACY CRITERION 

Suppose that in a given situation a single model, u (say), has been
chosen from a host of m models through one of the two approaches,
namely, sequential and nonsequential. Before this model is finally
accepted for actual use it should be examined for its adequacy for
the system. According to the concept utilized in this work, the
problem of testing the adequacy of a mechanistic model, and for that
reason of $M^{(u)}$, may be viewed from a different angle. In fact,
once we have visualized $\Pi^{(o)}$ as the population generated by
the system and $\Pi^{(u)}$ as the population simulated by the model
$M^{(u)}$, this model may be declared adequate for the system if the
two populations are sufficiently close to each other, in the
prevailing sense of the term. This can be checked by testing the
significance of the distance $K(\Pi^{(o)}, \Pi^{(u)})$, i.e., by
testing the hypothesis

\[ H_0 : K_n^{(u)} = 0, \]

where n denotes the number of observations available up to the stage
where such a confirmation is required. As proposed in Section 2.7,
such a hypothesis can be tested through the statistic V, given in
(2.7.10), if the model is multivariate. In case a model meant for a
single response system is to be tested for adequacy, the univariate
analogue of V or the statistic F of equation (2.7.14) may be used.
In any case the acceptance of $H_0$ will lead to the conclusion that
the model in question is capable of describing the mechanism of the
given system.


CHAPTER 3 


DESIGN OF EXPERIMENTS FOR MODEL DISCRIMINATION 
IN UNIRESPONSE SYSTEMS 


3.1 BASIC ASSUMPTIONS 

In Chapter 2 we proposed that the criterion function,
$\phi(\xi_{n+1})$, of (2.4.5) could be employed in searching for an
optimum (from the discrimination point of view) set of value(s) of
the input variable(s). In the present chapter we derive particular
forms of this function under the different sets of assumptions that
one may be permitted to make in a given situation while designing
experiments for a single response system. Now, there are certain
basic assumptions required for the development of these forms of
$\phi(\xi_{n+1})$. To list these, we consider the model $M^{(u)}$
(say), i.e., the equation

\[ Y_k = \eta^{(u)}(\xi_k, \theta^{(u)}) + \varepsilon_k^{(u)} \tag{3.1.1} \]

and assume that

B(i)   $E^{(u)}(\varepsilon_k^{(u)}) = 0$, k = 1, 2, ..., n,

B(ii)  $E^{(u)}(\varepsilon_k^{(u)} \varepsilon_l^{(u)}) = 0$, $k \neq l$ = 1, 2, ..., n,

B(iii) $E^{(u)}(\varepsilon_k^{(u)2}) = \sigma_u^2$, k = 1, 2, ..., n,

B(iv)  the error $\varepsilon_k^{(u)}$ is distributed as $N_1(0, \sigma_u^2)$,

and that

B(v)   the model u, if nonlinear, can be linearized in the
       parameter space;

the assumptions being applicable to all the m rival models.

3.2 CASE 1: KNOWN, HOMOGENEOUS VARIANCES OF ERRORS

We first consider the case where, in addition to the basic
assumptions B(i) through B(v) of Section 3.1, it may be reasonable
to assume that

C1(i)  $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_m^2 \ (= \sigma^2)$,

C1(ii) $\sigma^2$ is known.

As proposed earlier, the designing of an additional experiment,
$\xi_{n+1}$ (say), requires the use of the criterion function

\[ \phi(\xi_{n+1}) = \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} w_{u,v,n} \, K_{u,v,n+1}. \]

The weights, $w_{u,v,n}$, in this function are supposed to have been
obtained through the values of the discrimination index at the nth
stage. As regards the specification of the form of $K_{u,v,n+1}$
pertinent to the present set of assumptions, we consider the random
variable, $Y_{n+1}$, and seek its alternative probability
distributions under models u and v.

In the present set-up, given $\eta^{(u)}(\xi_{n+1}, \theta^{(u)})$
[$= \mu_{n+1}^{(u)}$ (say)] and $\sigma^2$, one can straightaway
write the p.d.f. of $Y_{n+1}$, under model u, as

\[ f^{(u)}(y_{n+1}/\mu_{n+1}^{(u)}, \sigma^2) = (2\pi\sigma^2)^{-1/2} \exp\left\{ -\frac{1}{2\sigma^2} (y_{n+1} - \mu_{n+1}^{(u)})^2 \right\}. \tag{3.2.1} \]

But, while at the nth stage, $\mu_{n+1}^{(u)}$ is unknown.
Therefore, what we actually need is the p.d.f. of $Y_{n+1}$, given
$\sigma^2$ only. We may, however, make use of the information
available through the previous n observations. This justifies the
use of the Bayesian approach.

Now, making use of the assumption B(i) in model (3.1.1), we have

\[ E^{(u)}(Y_k) = \eta^{(u)}(\xi_k, \theta^{(u)}). \tag{3.2.2} \]

We will be dealing with a general case if the response function is
considered to be nonlinear in parameters. Accordingly, we take
$\eta^{(u)}$ as a nonlinear function in terms of $\theta^{(u)}$.
However, the assumption B(v) in that case permits replacement of the
response function $\eta^{(u)}(\xi_k, \theta^{(u)})$ in (3.2.2) by its
linear approximation around some value of $\theta^{(u)}$ in the
parameter space. We choose this value as the m.l.e.,
$\hat{\theta}^{(u)}$, of $\theta^{(u)}$ and assume as well that the
estimates, $\hat{\theta}_t^{(u)}$, t = 1, 2, ..., $p_u$, are fairly
close to the model parameters, so that the above approximation may
be justified. This enables us to write

\[ E^{(u)}(Y_k) = \eta^{(u)}(\xi_k, \hat{\theta}^{(u)}) + \sum_{t=1}^{p_u} (\theta_t^{(u)} - \hat{\theta}_t^{(u)}) \, x_{kt}^{(u)}, \qquad k = 1, 2, \ldots, n, n+1, \tag{3.2.3} \]

where

\[ x_{kt}^{(u)} = \left[ \frac{\partial \eta^{(u)}(\xi_k, \theta^{(u)})}{\partial \theta_t^{(u)}} \right]_{\theta^{(u)} = \hat{\theta}^{(u)}}. \tag{3.2.4} \]


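(It may be noted that equation (3.2.3) remains valid even when the
response function is linear in parameters.) Alternatively, we can
write (3.2.3) as

\[ E^{(u)}(Y_k) - \hat{y}_k^{(u)} = x_k^{(u)\prime} (\theta^{(u)} - \hat{\theta}^{(u)}), \qquad k = 1, 2, \ldots, n, n+1, \tag{3.2.5} \]

where

\[ \hat{y}_k^{(u)} = \eta^{(u)}(\xi_k, \hat{\theta}^{(u)}) \tag{3.2.6} \]

and

\[ x_k^{(u)} = (x_{k1}^{(u)}, x_{k2}^{(u)}, \ldots, x_{kp_u}^{(u)})', \tag{3.2.7} \]

with $x_{kt}^{(u)}$ given by (3.2.4).

In particular, we have the identity

\[ \mu_{n+1}^{(u)} - \hat{y}_{n+1}^{(u)} = x_{n+1}^{(u)\prime} (\theta^{(u)} - \hat{\theta}^{(u)}). \tag{3.2.8} \]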


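Now, the p.d.f. of $Y_{n+1}$, under model u, given $\sigma^2$ and n
observations, $y = (y_1, y_2, \ldots, y_n)'$, can be obtained from the
formula

\[ f_{n+1}^{(u)}(y_{n+1}/\sigma^2) = \int_{R_\mu} f^{(u)}(y_{n+1}/\mu_{n+1}^{(u)}, \sigma^2) \, g^{(u)}(\mu_{n+1}^{(u)}/\sigma^2) \, d\mu_{n+1}^{(u)}. \tag{3.2.9} \]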


Of the two densities involved in the integrand, the first is
specified in (3.2.1), while the second is not known as yet.
However, the relation (3.2.8) suggests that the distribution of
$\mu_{n+1}^{(u)}$ is determined by that of
$x_{n+1}^{(u)\prime} \theta^{(u)}$. The (posterior) distribution of
$\theta^{(u)}$ must, therefore, be obtained as a first step. To that
purpose we consider the likelihood function of the parameters
$\theta^{(u)}$ based on n independent observations, y, i.e., the
function

\[ L(\theta^{(u)}/y, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left[ -\frac{1}{2\sigma^2} \varepsilon^{(u)\prime} \varepsilon^{(u)} \right], \]

where $\varepsilon^{(u)} = (\varepsilon_1^{(u)}, \varepsilon_2^{(u)}, \ldots, \varepsilon_n^{(u)})'$,
with $\varepsilon_k^{(u)}$ as in (3.1.1).

Using the linear form (3.2.5), we have

\[ y_k - E^{(u)}(Y_k) = (y_k - \hat{y}_k^{(u)}) - x_k^{(u)\prime} (\theta^{(u)} - \hat{\theta}^{(u)}), \]

i.e.,

\[ \varepsilon_k^{(u)} = e_k^{(u)} - x_k^{(u)\prime} (\theta^{(u)} - \hat{\theta}^{(u)}), \]

where

\[ e_k^{(u)} = y_k - \hat{y}_k^{(u)}. \]

The likelihood function, therefore, becomes

\[ L(\theta^{(u)}/y, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left[ -\frac{1}{2\sigma^2} \{e^{(u)} - X^{(u)}(\theta^{(u)} - \hat{\theta}^{(u)})\}' \{e^{(u)} - X^{(u)}(\theta^{(u)} - \hat{\theta}^{(u)})\} \right], \]

where $e^{(u)}$ is an $n \times 1$ vector of the discrepancies,
$e_k^{(u)}$, and $X^{(u)}$ is an $n \times p_u$ matrix of the partial
derivatives, $x_{kt}^{(u)}$, k = 1, 2, ..., n; t = 1, 2, ..., $p_u$,
defined in (3.2.4). Now the use of the m.l.e. $\hat{\theta}^{(u)}$
renders $e^{(u)\prime} X^{(u)} = 0$, which in turn reduces the
likelihood function to the form

\[ L(\theta^{(u)}/y, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left[ -\frac{1}{2\sigma^2} \{ e^{(u)\prime} e^{(u)} + (\theta^{(u)} - \hat{\theta}^{(u)})' X^{(u)\prime} X^{(u)} (\theta^{(u)} - \hat{\theta}^{(u)}) \} \right]. \tag{3.2.10} \]


The ignorance about the model parameters can be taken into account
by assuming a uniform prior distribution for $\theta^{(u)}$. The
posterior density of $\theta^{(u)}$ is, then, given by

\[ g(\theta^{(u)}/y, \sigma^2) = \frac{L(\theta^{(u)}/y, \sigma^2)}{\int_{R_\theta} L(\theta^{(u)}/y, \sigma^2) \, d\theta^{(u)}}, \tag{3.2.11} \]

where $R_\theta$ denotes the parameter space for model u. The use of
L from (3.2.10) in the formula (3.2.11) immediately gives

\[ g(\theta^{(u)}/y, \sigma^2) = C \, (2\pi\sigma^2)^{-p_u/2} \exp\left[ -\frac{1}{2\sigma^2} \{ (\theta^{(u)} - \hat{\theta}^{(u)})' X^{(u)\prime} X^{(u)} (\theta^{(u)} - \hat{\theta}^{(u)}) \} \right]. \tag{3.2.12} \]

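This shows that $\theta^{(u)}$ is distributed as
$N_{p_u}(\hat{\theta}^{(u)}, (X^{(u)\prime} X^{(u)})^{-1} \sigma^2)$.
So from the relation (3.2.8) we conclude that $\mu_{n+1}^{(u)}$ is
distributed normally with mean $\hat{y}_{n+1}^{(u)}$ and variance
$[x_{n+1}^{(u)\prime} (X^{(u)\prime} X^{(u)})^{-1} x_{n+1}^{(u)} \sigma^2]$.
The p.d.f. of $\mu_{n+1}^{(u)}$, given $\sigma^2$, can accordingly be
written as

\[ g^{(u)}(\mu_{n+1}^{(u)}/\sigma^2) = (2\pi\sigma^2 z_u)^{-1/2} \exp\left[ -\frac{1}{2\sigma^2 z_u} (\mu_{n+1}^{(u)} - \hat{y}_{n+1}^{(u)})^2 \right], \tag{3.2.13} \]

where

\[ z_u = x_{n+1}^{(u)\prime} (X^{(u)\prime} X^{(u)})^{-1} x_{n+1}^{(u)}. \tag{3.2.14} \]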


Substituting for $f^{(u)}(y_{n+1}/\mu_{n+1}^{(u)}, \sigma^2)$ from
(3.2.1) and for $g^{(u)}(\mu_{n+1}^{(u)}/\sigma^2)$ from (3.2.13) in
the formula (3.2.9), we get

\[ f_{n+1}^{(u)}(y_{n+1}/\sigma^2) = [(2\pi\sigma^2)^2 z_u]^{-1/2} \int_{R_\mu} \exp\left[ -\frac{1}{2} \left\{ \frac{(\mu_{n+1}^{(u)} - y_{n+1})^2}{\sigma^2} + \frac{(\mu_{n+1}^{(u)} - \hat{y}_{n+1}^{(u)})^2}{\sigma^2 z_u} \right\} \right] d\mu_{n+1}^{(u)}. \tag{3.2.15} \]

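Using (A.1.11) and (A.1.12) of Corollary A.1 (Appendix A), we can
write

\[ \frac{(\mu_{n+1}^{(u)} - y_{n+1})^2}{\sigma^2} + \frac{(\mu_{n+1}^{(u)} - \hat{y}_{n+1}^{(u)})^2}{\sigma^2 z_u} = \frac{(\mu_{n+1}^{(u)} - M)^2}{\sigma^4 z_u / \omega_u^2} + \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\omega_u^2}, \]

where $M = [(\sigma^2 z_u y_{n+1} + \sigma^2 \hat{y}_{n+1}^{(u)}) / \omega_u^2]$
and

\[ \omega_u = \sigma (1 + z_u)^{1/2}. \tag{3.2.16} \]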

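As a result, from (3.2.15), we get

\[ f_{n+1}^{(u)}(y_{n+1}/\sigma^2) = (2\pi\omega_u^2)^{-1/2} \exp\left[ -\frac{1}{2} \left( \frac{y_{n+1} - \hat{y}_{n+1}^{(u)}}{\omega_u} \right)^2 \right]. \tag{3.2.17} \]

This leads to the conclusion that, under model u, $Y_{n+1}$, in the
present set-up, is distributed as
$N_1(\hat{y}_{n+1}^{(u)}, \omega_u^2)$. Similarly, under model v
($v \neq u$), $Y_{n+1}$ will be distributed as
$N_1(\hat{y}_{n+1}^{(v)}, \omega_v^2)$, so that the p.d.f. of
$Y_{n+1}$, under model v, given $\sigma^2$ and n observations, can be
written as

\[ f_{n+1}^{(v)}(y_{n+1}/\sigma^2) = (2\pi\omega_v^2)^{-1/2} \exp\left[ -\frac{1}{2} \left( \frac{y_{n+1} - \hat{y}_{n+1}^{(v)}}{\omega_v} \right)^2 \right], \tag{3.2.18} \]

where $\omega_v = \sigma (1 + z_v)^{1/2}$.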

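Using (3.2.17) and (3.2.18) in the relation

\[ h_1(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = \int_{R^1} [ f_{n+1}^{(u)}(y_{n+1}/\sigma^2) \, f_{n+1}^{(v)}(y_{n+1}/\sigma^2) ]^{1/2} \, dy_{n+1}, \]

we get

\[ h_1(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = (4\pi^2 \omega_u^2 \omega_v^2)^{-1/4} \int_{R^1} \exp\left[ -\frac{1}{4} \left\{ \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\omega_u^2} + \frac{(y_{n+1} - \hat{y}_{n+1}^{(v)})^2}{\omega_v^2} \right\} \right] dy_{n+1}. \tag{3.2.19} \]

Now by Corollary A.1 of Appendix A we can write

\[ \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\omega_u^2} + \frac{(y_{n+1} - \hat{y}_{n+1}^{(v)})^2}{\omega_v^2} = \frac{(y_{n+1} - W)^2}{\omega^2} + \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\omega_u^2 + \omega_v^2}, \]

where

\[ W = \frac{\omega_v^2 \hat{y}_{n+1}^{(u)} + \omega_u^2 \hat{y}_{n+1}^{(v)}}{\omega_u^2 + \omega_v^2} \quad \text{and} \quad \omega = \left\{ \frac{\omega_u^2 \omega_v^2}{\omega_u^2 + \omega_v^2} \right\}^{1/2}. \tag{3.2.20} \]

Using (3.2.20) in (3.2.19), we get

\[ h_1(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = (4\pi^2 \omega_u^2 \omega_v^2)^{-1/4} \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\omega_u^2 + \omega_v^2} \right\} \right] \int_{R^1} \exp\left[ -\frac{(y_{n+1} - W)^2}{4\omega^2} \right] dy_{n+1}, \tag{3.2.21} \]

i.e.,

\[ h_1(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = (4\pi^2 \omega_u^2 \omega_v^2)^{-1/4} (4\pi\omega^2)^{1/2} \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\omega_u^2 + \omega_v^2} \right\} \right]. \]

Substituting for $\omega$ from (3.2.20) we finally get

\[ h_1(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = \left[ \frac{4\omega_u^2 \omega_v^2}{(\omega_u^2 + \omega_v^2)^2} \right]^{1/4} \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\omega_u^2 + \omega_v^2} \right\} \right]. \tag{3.2.22} \]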
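
The closed form for the affinity between the two normal predictive densities can be checked numerically. The sketch below is only a sanity check under stated assumptions: the function names are hypothetical, the inputs are arbitrary illustrative values, and a plain trapezoidal rule stands in for the analytic integration used in the derivation.

```python
import math


def h1_closed_form(yhat_u, yhat_v, omega_u, omega_v):
    """Affinity between N(yhat_u, omega_u^2) and N(yhat_v, omega_v^2)."""
    s = omega_u ** 2 + omega_v ** 2
    return math.sqrt(2.0 * omega_u * omega_v / s) * math.exp(
        -((yhat_u - yhat_v) ** 2) / (4.0 * s))


def normal_pdf(y, mean, sd):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def h1_numeric(yhat_u, yhat_v, omega_u, omega_v, steps=20000):
    """Trapezoidal approximation of the integral of sqrt(f_u * f_v)."""
    reach = 12.0 * max(omega_u, omega_v)
    lo = min(yhat_u, yhat_v) - reach
    hi = max(yhat_u, yhat_v) + reach
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * width
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.sqrt(
            normal_pdf(y, yhat_u, omega_u) * normal_pdf(y, yhat_v, omega_v))
    return total * width
```

Agreement of the two functions to several decimal places for a handful of inputs gives some reassurance that the completing-the-square step has been carried through correctly.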


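We conclude this section by proposing that the appropriate design
criterion, when $\sigma^2$ is known and the models are linear or can
be linearized, consists of maximizing the function

\[ \phi(\xi_{n+1}) = \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} w_{u,v,n} \left[ 1 - \left\{ \frac{2\omega_u \omega_v}{\omega_u^2 + \omega_v^2} \right\}^{1/2} \exp\left\{ -\frac{1}{4} \left( \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\omega_u^2 + \omega_v^2} \right) \right\} \right]^{1/2} \tag{3.2.23} \]

with respect to $\xi_{n+1}$ within the operability region.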
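
To make the criterion concrete, the following sketch evaluates it for given predictions and design matrices. It is a minimal illustration, assuming NumPy is available; the function names, the dictionary layout of the weights and the way the inputs are passed are all hypothetical choices made here.

```python
import numpy as np


def predictive_sd(X, x_new, sigma):
    """omega = sigma * sqrt(1 + z), with z = x_new' (X'X)^{-1} x_new.

    X is the n x p matrix of partial derivatives at the estimate and
    x_new the corresponding p-vector at the candidate setting.
    """
    z = float(x_new @ np.linalg.solve(X.T @ X, x_new))
    return sigma * np.sqrt(1.0 + z)


def pair_distance(yhat_u, yhat_v, omega_u, omega_v):
    """K for one pair of models: sqrt(1 - h1), h1 being the normal affinity."""
    s = omega_u ** 2 + omega_v ** 2
    h1 = np.sqrt(2.0 * omega_u * omega_v / s) * np.exp(
        -((yhat_u - yhat_v) ** 2) / (4.0 * s))
    return float(np.sqrt(max(0.0, 1.0 - h1)))


def criterion(yhat, omega, weights):
    """Weighted sum of pairwise distances; weights[(u, v)] is given for u < v."""
    m = len(yhat)
    return sum(weights[(u, v)] * pair_distance(yhat[u], yhat[v], omega[u], omega[v])
               for u in range(m - 1) for v in range(u + 1, m))
```

In use, one would evaluate `criterion` over a grid (or with an optimizer) of candidate input settings inside the operability region, recomputing the predictions and the `predictive_sd` values for each model at every candidate, and run the next experiment at the maximizing setting.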


3.3 CASE 2: UNKNOWN, HOMOGENEOUS VARIANCES OF ERRORS

In actual practice, the information about the variance of errors is
often lacking. However, the normality and additivity of errors as
well as the linearization of nonlinear models may still be possible.
In the present case, therefore, we retain all the assumptions B(i)
through B(v) of Section 3.1. In addition, we assume that

C2(i)  $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_m^2 \ (= \sigma^2)$,

C2(ii) $\sigma^2$ is unknown.

Now, in order to be able to use the objective function (2.4.5) for
designing new experiments, we derive the form of $K_{u,v,n+1}$
relevant to the present case. To that purpose, we first consider a
random variable, $Y_{n+1}$, and seek its p.d.f. (with reference to
the future), under models u and v, when all the above mentioned
assumptions hold good. Of all these, if the assumption C2(ii) is
suppressed for the time being, then through the discussion in
Section 3.2 we know that the p.d.f. of $Y_{n+1}$, under model u,
given $\sigma^2$ (= $\lambda$ (say)), can be written as [refer to
equation (3.2.17)]

\[ f_{n+1}^{(u)}(y_{n+1}/y, \lambda) = [2\pi\lambda(1 + z_u)]^{-1/2} \exp\left[ -\frac{1}{2} \left\{ \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\lambda (1 + z_u)} \right\} \right], \tag{3.3.1} \]

where y is the set of previous n observations and

\[ z_u = x_{n+1}^{(u)\prime} [X^{(u)\prime} X^{(u)}]^{-1} x_{n+1}^{(u)}, \tag{3.3.2} \]

while the vector $x_{n+1}^{(u)}$ and the matrix $X^{(u)}$ are defined
earlier in Section 3.2.

Since in the situation under consideration $\lambda$ is unknown, we
use some appropriate prior for $\lambda$ and consider the marginal
p.d.f. of $Y_{n+1}$, i.e.,

\[ f_{n+1}^{(u)}(y_{n+1}/y) = \int_0^\infty f_{n+1}^{(u)}(y_{n+1}/y, \lambda) \, g(\lambda/y) \, d\lambda, \tag{3.3.3} \]

where $g(\lambda/y)$ is the posterior density of $\lambda$, given a
sample of n observations. We use a noninformative prior for
$\lambda$. According to Jeffreys' (1961) rule, the prior
distribution of a parameter, $\theta^*$ (say), is approximately
noninformative if it is taken proportional to the square root of its
Fisher information measure, $\mathcal{I}(\theta^*)$. Since in the
present case such a measure is given by

\[ \mathcal{I}(\lambda) \propto 1/\lambda^2, \]

the desired prior can be taken as

\[ g(\lambda) \propto 1/\lambda. \tag{3.3.4} \]


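Besides, the sampling distribution of the sum of squares,
$(\nu \hat{\sigma}^2 / \lambda)$, is a Chi-square distribution with
$\nu$ degrees of freedom. Consequently, we have

\[ g(\hat{\sigma}^2/\lambda) = \frac{(\nu / 2\lambda)^{\nu/2}}{\Gamma(\nu/2)} \, (\hat{\sigma}^2)^{\nu/2 - 1} \exp\left( -\frac{\nu \hat{\sigma}^2}{2\lambda} \right), \qquad \lambda > 0, \tag{3.3.5} \]

where $\hat{\sigma}^2$ is an appropriate estimate of $\sigma^2$,
$\nu$ is the number of degrees of freedom associated with this
estimate and $\Gamma$ denotes the gamma function, defined in
Appendix D. Since in the present case
$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_m^2 \ (= \sigma^2)$, the
estimate $\hat{\sigma}^2$ is, in fact, obtained by pooling the
estimates, $\hat{\sigma}_u^2$'s, of $\sigma_u^2$'s, each secured
through the corresponding model. Thus

\[ \hat{\sigma}^2 = \frac{1}{\nu} \sum_{u=1}^{m} \nu_u \hat{\sigma}_u^2, \tag{3.3.6} \]

where $\nu = \sum_{u=1}^{m} \nu_u$ and $\nu_u$ is the number of
degrees of freedom due to the estimate $\hat{\sigma}_u^2$.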
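
The pooling step is a degrees-of-freedom weighted average. A minimal sketch follows, with a hypothetical function name and made-up numbers in the usage note.

```python
def pooled_variance(estimates, dofs):
    """Pool per-model variance estimates, weighting by degrees of freedom.

    Returns the pooled estimate and the total degrees of freedom.
    """
    nu = sum(dofs)
    pooled = sum(v * s for v, s in zip(dofs, estimates)) / nu
    return pooled, nu
```

For instance, `pooled_variance([2.0, 4.0], [5, 15])` combines an estimate 2.0 on 5 degrees of freedom with an estimate 4.0 on 15 degrees of freedom.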

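So far as the posterior, $g(\lambda/y)$ [$= g(\lambda/\hat{\sigma}^2)$],
of $\lambda$ is concerned, we may use (3.3.4) and (3.3.5) in the
relation

\[ g(\lambda/\hat{\sigma}^2) \propto g(\hat{\sigma}^2/\lambda) \, g(\lambda). \]

Keeping in view the fact that for a given sample of n observations
the quantity
$[\{(\nu/2)^{\nu/2} / \Gamma(\nu/2)\} (\hat{\sigma}^2)^{\nu/2 - 1}]$
is a fixed constant, we can write

\[ g(\lambda/y) = c \, \lambda^{-(\nu/2 + 1)} \exp\left\{ -\left( \frac{\nu \hat{\sigma}^2}{2} \right) \lambda^{-1} \right\}, \]

where

\[ c^{-1} = \int_0^\infty \lambda^{-(\nu/2 + 1)} \exp\left\{ -\left( \frac{\nu \hat{\sigma}^2}{2} \right) \lambda^{-1} \right\} d\lambda. \]

Using the inverse gamma function (see Appendix D), we get

\[ c^{-1} = \Gamma(\nu/2) \left( \frac{\nu \hat{\sigma}^2}{2} \right)^{-\nu/2}. \]

Thus the posterior p.d.f. of $\lambda$ finally reduces to the form

\[ g(\lambda/y) = \frac{(\nu \hat{\sigma}^2 / 2)^{\nu/2}}{\Gamma(\nu/2)} \, \lambda^{-(\nu/2 + 1)} \exp\left\{ -\left( \frac{\nu \hat{\sigma}^2}{2} \right) \lambda^{-1} \right\}, \qquad \lambda > 0. \tag{3.3.7} \]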


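We now utilize (3.3.1) and (3.3.7) in the formula (3.3.3) and obtain

\[ f_{n+1}^{(u)}(y_{n+1}/y) = \left( \frac{\nu \hat{\sigma}^2}{2} \right)^{\nu/2} [\{2\pi(1 + z_u)\}^{1/2} \, \Gamma(\nu/2)]^{-1} \int_0^\infty \lambda^{-\left(\frac{\nu + 1}{2} + 1\right)} \exp\left[ -\left\{ \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{2(1 + z_u)} + \frac{\nu \hat{\sigma}^2}{2} \right\} \lambda^{-1} \right] d\lambda. \tag{3.3.8} \]

The integral on the right hand side of (3.3.8) can be solved by the
inverse gamma function (see Appendix D), so that

\[ f_{n+1}^{(u)}(y_{n+1}/y) = \left( \frac{\nu \hat{\sigma}^2}{2} \right)^{\nu/2} [\Gamma(1/2) \{2(1 + z_u)\}^{1/2} \, \Gamma(\nu/2)]^{-1} \left[ \frac{1}{2} \left\{ \nu \hat{\sigma}^2 + \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{1 + z_u} \right\} \right]^{-(\nu + 1)/2} \Gamma\left( \frac{\nu + 1}{2} \right) \]

\[ = [\nu \hat{\sigma}^2 (1 + z_u)]^{-1/2} \left[ 1 + \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\nu \hat{\omega}_u^2} \right]^{-(\nu + 1)/2} \frac{\Gamma\left( \frac{\nu + 1}{2} \right)}{\Gamma\left( \frac{1}{2} \right) \Gamma\left( \frac{\nu}{2} \right)}. \]

The p.d.f. of $Y_{n+1}$, given n observations, y, is thus given by

\[ f_{n+1}^{(u)}(y_{n+1}/y) = [B(\nu/2, 1/2) \, \nu^{1/2} \, \hat{\omega}_u]^{-1} \left[ 1 + \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\nu \hat{\omega}_u^2} \right]^{-(\nu + 1)/2}, \tag{3.3.9} \]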



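where

\[ \hat{\omega}_u = \hat{\sigma} (1 + z_u)^{1/2} \tag{3.3.10} \]

and B(p,q) stands for the Beta function, given in D.2 of Appendix D.
Similarly, the p.d.f. of $Y_{n+1}$, under model v, in the present set
of assumptions is given by

\[ f_{n+1}^{(v)}(y_{n+1}/y) = [B(\nu/2, 1/2) \, \nu^{1/2} \, \hat{\omega}_v]^{-1} \left[ 1 + \frac{(y_{n+1} - \hat{y}_{n+1}^{(v)})^2}{\nu \hat{\omega}_v^2} \right]^{-(\nu + 1)/2}, \tag{3.3.11} \]

where

\[ \hat{\omega}_v = \hat{\sigma} (1 + z_v)^{1/2}. \tag{3.3.12} \]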
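
The predictive density obtained here is a Student-t density with a location and a scale. The sketch below evaluates it on the log scale for numerical stability; the function names are hypothetical and `math.lgamma` is used to form the Beta function.

```python
import math


def log_beta(p, q):
    """log B(p, q) via log-gamma functions."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def predictive_density(y, yhat, sigma_hat, z, nu):
    """Density of Y_{n+1} given the data when the common variance is unknown.

    yhat is the model prediction, sigma_hat the pooled standard deviation
    estimate on nu degrees of freedom, and z the quadratic form
    x' (X'X)^{-1} x for the candidate setting.
    """
    omega = sigma_hat * math.sqrt(1.0 + z)
    log_norm = -(log_beta(nu / 2.0, 0.5) + 0.5 * math.log(nu) + math.log(omega))
    return math.exp(log_norm
                    - 0.5 * (nu + 1) * math.log1p((y - yhat) ** 2 / (nu * omega ** 2)))
```

Compared with the known-variance case, the tails are heavier for small degrees of freedom, which is how the uncertainty about the variance enters the design criterion.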

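The densities specified in (3.3.9) and (3.3.11), when used in the
relation

\[ h_2(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = \int_{R^1} [ f_{n+1}^{(u)}(y_{n+1}/y) \, f_{n+1}^{(v)}(y_{n+1}/y) ]^{1/2} \, dy_{n+1}, \]

give

\[ h_2(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = [B(\nu/2, 1/2) \, \nu^{1/2} \, (\hat{\omega}_u^2 \hat{\omega}_v^2)^{1/4}]^{-1} \int_{R^1} \left( 1 + \frac{t_u}{\nu} \right)^{-(\nu + 1)/4} \left( 1 + \frac{t_v}{\nu} \right)^{-(\nu + 1)/4} dy_{n+1}, \tag{3.3.13} \]

where

\[ t_i = [ (y_{n+1} - \hat{y}_{n+1}^{(i)})^2 / \hat{\omega}_i^2 ], \qquad i = u, v. \tag{3.3.14} \]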


The integrand in (3.3.13), however, needs to be reduced to a
tractable form for integration. To that purpose we write

\[ \left( 1 + \frac{t_u}{\nu} \right)^{-(\nu + 1)/4} = \exp\left\{ -\frac{\nu + 1}{4} \log_e\left( 1 + \frac{t_u}{\nu} \right) \right\}. \tag{3.3.15} \]

If we use the expansion of $\log_e(1 + t_u \nu^{-1})$, in terms of
$(t_u \nu^{-1})$, in the component
$[\{(\nu + 1)/4\} \log_e(1 + t_u \nu^{-1})]$ and arrange the terms in
ascending order of the powers of $\nu^{-1}$, we can express this
component as

\[ \left( \frac{\nu + 1}{4} \right) \log_e\left( 1 + \frac{t_u}{\nu} \right) = \frac{1}{4} t_u - \sum_{i=1}^{\infty} \tau_i \nu^{-i}, \tag{3.3.16} \]

where $\tau_i$ is related to $t_u$ through the expression

\[ \tau_i = \frac{(-1)^i}{4 i (i + 1)} \{ (i + 1) t_u^i - i \, t_u^{i+1} \}. \tag{3.3.17} \]

(Refer to D.6 of Appendix D for details of the algebra used.)
Using (3.3.16), the equation (3.3.15) can thus be rewritten as

\[ \left( 1 + \frac{t_u}{\nu} \right)^{-(\nu + 1)/4} = \exp\left( -\frac{1}{4} t_u \right) \exp\left( \sum_{i=1}^{\infty} \tau_i \nu^{-i} \right) = \exp\left( -\frac{1}{4} t_u \right) \left[ \prod_{i=1}^{\infty} \exp(\tau_i \nu^{-i}) \right]. \tag{3.3.18} \]

We can further simplify the right hand side of this equation by
using the expansions of the factors, $\exp(\tau_i \nu^{-i})$. In
fact, when the coefficients of the like powers of $\nu^{-1}$ are
collected in the infinite product of these expansions, we get

\[ \prod_{i=1}^{\infty} \exp(\tau_i \nu^{-i}) = \sum_{i=0}^{\infty} a_i \nu^{-i}, \tag{3.3.19} \]


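where

\[ a_0 = 1, \]
\[ a_1 = \tau_1, \]
\[ a_2 = \frac{\tau_1^2}{2} + \tau_2, \]
\[ a_3 = \frac{\tau_1^3}{6} + \tau_1 \tau_2 + \tau_3, \]
\[ a_4 = \frac{\tau_1^4}{24} + \frac{\tau_1^2 \tau_2}{2} + \frac{\tau_2^2}{2} + \tau_1 \tau_3 + \tau_4, \]
\[ a_5 = \frac{\tau_1^5}{120} + \frac{\tau_1^3 \tau_2}{6} + \frac{\tau_1^2 \tau_3}{2} + \frac{\tau_1 \tau_2^2}{2} + \tau_1 \tau_4 + \tau_2 \tau_3 + \tau_5, \]
\[ a_6 = \frac{\tau_1^6}{720} + \frac{\tau_1^4 \tau_2}{24} + \frac{\tau_1^3 \tau_3}{6} + \frac{\tau_1^2 \tau_2^2}{4} + \frac{\tau_1^2 \tau_4}{2} + \tau_1 \tau_2 \tau_3 + \frac{\tau_2^3}{6} + \frac{\tau_3^2}{2} + \tau_1 \tau_5 + \tau_2 \tau_4 + \tau_6, \]

etc.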
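
The coefficients of such an exponential of a power series need not be written out by hand; they follow from a short recurrence. The sketch below is illustrative only: the function names are hypothetical, and the recurrence n a_n = sum_{k=1}^{n} k tau_k a_{n-k} is the standard one for exp of a power series, swapped in here in place of the term-by-term collection described in the text.

```python
def tau(i, t):
    """tau_i as a function of t, for i >= 1."""
    return (-1) ** i * ((i + 1) * t ** i - i * t ** (i + 1)) / (4.0 * i * (i + 1))


def series_coefficients(t, order):
    """Coefficients a_0..a_order of exp(sum_i tau_i x^i) in powers of x = 1/nu."""
    taus = [0.0] + [tau(i, t) for i in range(1, order + 1)]
    a = [1.0]
    for n in range(1, order + 1):
        a.append(sum(k * taus[k] * a[n - k] for k in range(1, n + 1)) / n)
    return a
```

With these coefficients, `exp(-t/4) * sum(a[i] * nu**(-i))` can be compared directly against `(1 + t/nu) ** (-(nu + 1) / 4)` to see how many terms are needed for a given number of degrees of freedom.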


Further, by utilizing the relation (3.3.17) between $\tau_i$ and
$t_u$ in the above equations, we can express each coefficient of the
series in (3.3.19) as a polynomial in $t_u$. Let $A_i(t_u)$ stand
for the polynomial corresponding to $a_i$. Then

\[ A_i(t_u) = \sum_{p=1}^{i+1} \alpha_{ip} \, t_u^{2i - p + 1}, \tag{3.3.20} \]

where $\alpha_{ip}$'s are given in Table 3.3.1.



[Table 3.3.1: values of the coefficients $\alpha_{ip}$ of the
polynomials $A_i(t_u)$; the tabulated entries are illegible in this
copy.]


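A corresponding change in (3.3.19) gives

\[ \prod_{i=1}^{\infty} \exp(\tau_i \nu^{-i}) = \sum_{i=0}^{\infty} A_i(t_u) \, \nu^{-i}. \tag{3.3.21} \]

Accordingly, when we use (3.3.21) in (3.3.18) we get

\[ \left( 1 + \frac{t_u}{\nu} \right)^{-(\nu + 1)/4} = \exp\left( -\frac{1}{4} t_u \right) \left[ \sum_{i=0}^{\infty} A_i(t_u) \, \nu^{-i} \right]. \tag{3.3.22} \]

Proceeding on the same lines, we can express the second factor of
the integrand in (3.3.13) as

\[ \left( 1 + \frac{t_v}{\nu} \right)^{-(\nu + 1)/4} = \exp\left( -\frac{1}{4} t_v \right) \left[ \sum_{j=0}^{\infty} B_j(t_v) \, \nu^{-j} \right], \tag{3.3.23} \]

where $B_j(t_v)$, like $A_i(t_u)$, is a polynomial in $t_v$ and is
given by

\[ B_j(t_v) = \sum_{q=1}^{j+1} \beta_{jq} \, t_v^{2j - q + 1}, \tag{3.3.24} \]

with $\beta_{jq}$'s as given in Table 3.3.1.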

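When we make use of (3.3.22) and (3.3.23), the integrand in (3.3.13)
becomes

\[ \left( 1 + \frac{t_u}{\nu} \right)^{-(\nu + 1)/4} \left( 1 + \frac{t_v}{\nu} \right)^{-(\nu + 1)/4} = \left[ \exp\left\{ -\frac{1}{4} (t_u + t_v) \right\} \right] \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} Q_{ij}(t_u, t_v) \, \nu^{-(i + j)}, \tag{3.3.25} \]

where

\[ Q_{ij}(t_u, t_v) = \sum_{p=1}^{i+1} \sum_{q=1}^{j+1} \alpha_{ip} \beta_{jq} \, t_u^{2i - p + 1} \, t_v^{2j - q + 1} \tag{3.3.26} \]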


is a bivariate polynomial in $t_u$ and $t_v$, obtained through the
product, $[A_i(t_u) B_j(t_v)]$, of the polynomials in (3.3.20) and
(3.3.24). Further, recalling the definition of $t_i$ (i = u, v) from
(3.3.14), we have

\[ t_u + t_v = \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\hat{\omega}_u^2} + \frac{(y_{n+1} - \hat{y}_{n+1}^{(v)})^2}{\hat{\omega}_v^2}. \tag{3.3.27} \]


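However, through the application of the results (A.1.11) and
(A.1.12) [Corollary A.1, Appendix A] on the right hand side of
(3.3.27) we can express $(t_u + t_v)$, alternatively, as

\[ t_u + t_v = \frac{(y_{n+1} - \tilde{y}_{n+1})^2}{\tilde{\omega}^2} + \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\hat{\omega}_u^2 + \hat{\omega}_v^2}, \tag{3.3.28} \]

where $\tilde{y}_{n+1} = (\hat{\omega}_v^2 \hat{y}_{n+1}^{(u)} + \hat{\omega}_u^2 \hat{y}_{n+1}^{(v)}) / (\hat{\omega}_u^2 + \hat{\omega}_v^2)$
and

\[ \tilde{\omega} = \left( \frac{\hat{\omega}_u^2 \hat{\omega}_v^2}{\hat{\omega}_u^2 + \hat{\omega}_v^2} \right)^{1/2}. \tag{3.3.29} \]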

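Substituting (3.3.28) in (3.3.25), we get

\[ \left( 1 + \frac{t_u}{\nu} \right)^{-(\nu + 1)/4} \left( 1 + \frac{t_v}{\nu} \right)^{-(\nu + 1)/4} = \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\hat{\omega}_u^2 + \hat{\omega}_v^2} \right\} \right] \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} Q_{ij}(t_u, t_v) \, \nu^{-(i + j)} \exp\left[ -\frac{1}{4} \left\{ \frac{(y_{n+1} - \tilde{y}_{n+1})^2}{\tilde{\omega}^2} \right\} \right]. \tag{3.3.30} \]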


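When the expression, from (3.3.30), for the integrand is introduced
in (3.3.13), $h_2$ reduces to the form

\[ h_2(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = [B(\nu/2, 1/2) \, \nu^{1/2} \, (\hat{\omega}_u^2 \hat{\omega}_v^2)^{1/4}]^{-1} \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\hat{\omega}_u^2 + \hat{\omega}_v^2} \right\} \right] \int_{R^1} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} Q_{ij}(t_u, t_v) \, \nu^{-(i + j)} \exp\left[ -\frac{1}{4} \left\{ \frac{(y_{n+1} - \tilde{y}_{n+1})^2}{\tilde{\omega}^2} \right\} \right] dy_{n+1}. \tag{3.3.31} \]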


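Now, the integral in (3.3.31) can be evaluated term by term. In this
connection, the integrand involved therein suggests that if
$Y_{n+1}$ is considered to be distributed as
$N_1(\tilde{y}_{n+1}, 2\tilde{\omega}^2)$, then each integral,
involving the bivariate polynomial $Q_{ij}$, would yield a polynomial
of the mixed moments of $t_u$ and $t_v$, of the type
$\mu_{ab} = E(t_u^a t_v^b)$. Using these facts in (3.3.31) we get

\[ h_2(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = [B(\nu/2, 1/2) \, \nu^{1/2} \, (\hat{\omega}_u^2 \hat{\omega}_v^2)^{1/4}]^{-1} [2\pi (2\tilde{\omega}^2)]^{1/2} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{p=1}^{i+1} \sum_{q=1}^{j+1} \alpha_{ip} \beta_{jq} \, \mu_{2i - p + 1, \, 2j - q + 1} \, \nu^{-(i + j)} \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\hat{\omega}_u^2 + \hat{\omega}_v^2} \right\} \right]. \]

Finally, when we use the expression for $\tilde{\omega}$ from
(3.3.29) and rearrange the terms, we are able to express $h_2$ as

\[ h_2(f_{n+1}^{(u)}, f_{n+1}^{(v)}) = C^* \left\{ \frac{4 \hat{\omega}_u^2 \hat{\omega}_v^2}{(\hat{\omega}_u^2 + \hat{\omega}_v^2)^2} \right\}^{1/4} \exp\left[ -\frac{1}{4} \left\{ \frac{(\hat{y}_{n+1}^{(u)} - \hat{y}_{n+1}^{(v)})^2}{\hat{\omega}_u^2 + \hat{\omega}_v^2} \right\} \right], \tag{3.3.32} \]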



where

\[ C^* = \frac{(2\pi)^{1/2}}{B(\nu/2, 1/2)} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{p=1}^{i+1} \sum_{q=1}^{j+1} \alpha_{ip} \beta_{jq} \, \mu_{2i - p + 1, \, 2j - q + 1} \, \nu^{-(i + j + \frac{1}{2})}. \tag{3.3.33} \]

Substitution of $h_2$ from (3.3.32) for h in the function

\[ K_{u,v,n+1} = [1 - h]^{1/2} \]

would give the required expression for $K_{u,v,n+1}$ and hence the
criterion function $\phi(\xi_{n+1})$ for the case when the common
variance of error, $\sigma^2$, is unknown and the information
available from the previous n observations is utilized.

The mixed moments, $\mu_{ab}$, involved in (3.3.33) are difficult to
evaluate. However, if we make use of the inversion formulae of Cook
(1951), given in D.5 of Appendix D, then we can find at least some of
these, conveniently, through the joint cumulants $\kappa_{ab}$.
Fortunately, the number of these formulae happens to be adequate for
our purpose, as will be argued later. So far as the evaluation of
$\kappa_{ab}$ is concerned, the fact that $Y_{n+1}$ may be considered
to be distributed as $N_1(\tilde{y}_{n+1}, 2\tilde{\omega}^2)$ enables
us to use Lemma C.1 of Appendix C and get the following expressions
for the required cumulants:

\[ \kappa_{a0} = 2^{a-1} (a - 1)! \, (\rho_u/\rho)^a \, [1 + a \rho d_u^2], \]

\[ \kappa_{0b} = 2^{b-1} (b - 1)! \, (\rho_v/\rho)^b \, [1 + b \rho d_v^2], \]

\[ \kappa_{ab} = 2^{a+b-1} (a + b - 2)! \, (\rho_u/\rho)^a (\rho_v/\rho)^b \, [ (a + b - 1) + a(a - 1) \rho d_u^2 + b(b - 1) \rho d_v^2 + 2ab \rho d_u d_v ], \]

where $\rho_s = (\hat{\omega}_s^2)^{-1}$,
$d_s = (\tilde{y}_{n+1} - \hat{y}_{n+1}^{(s)})$, s = u, v, and
$\rho = (\rho_u + \rho_v)/2$.
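
These cumulant expressions are straightforward to code. The sketch below is an illustration with hypothetical names; it treats the pure cumulants (a or b equal to zero) separately so that no factorial of a negative number is ever requested.

```python
import math


def joint_cumulant(a, b, rho_u, rho_v, d_u, d_v):
    """Joint cumulant of order (a, b) of the two scaled squared deviations.

    rho_u, rho_v are the reciprocals of the squared predictive scales and
    d_u, d_v the deviations of the combined centre from each prediction.
    """
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("orders must be non-negative and not both zero")
    rho = 0.5 * (rho_u + rho_v)
    scale = (rho_u / rho) ** a * (rho_v / rho) ** b
    if b == 0:
        return 2 ** (a - 1) * math.factorial(a - 1) * scale * (1 + a * rho * d_u ** 2)
    if a == 0:
        return 2 ** (b - 1) * math.factorial(b - 1) * scale * (1 + b * rho * d_v ** 2)
    bracket = ((a + b - 1) + a * (a - 1) * rho * d_u ** 2
               + b * (b - 1) * rho * d_v ** 2 + 2 * a * b * rho * d_u * d_v)
    return 2 ** (a + b - 1) * math.factorial(a + b - 2) * scale * bracket
```

As a quick check, with equal scales the order-(1,1) cumulant is the covariance of two shifted squares of the same standard normal variable, which equals 2 + 4 c_u c_v for shifts c_u and c_v; the function reproduces this value.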
Remarks:

(i) A close look at the expression in (3.3.32) brings out the
important point that the function $h_2$, and hence $K_{u,v,n+1}$,
takes into account the dissimilarity which a new observation is
likely to introduce, in terms of the expectations and orientations
of the underlying probability distributions associated with models
u and v.

(ii) Equation (3.3.32) also shows that, except for the factor $C^*$,
the expression for $h_2$ is similar to that for $h_1$ of Case 1,
where the error variance $\sigma^2$ was assumed to be known;
$\sigma^2$ is now replaced by its sample estimate, $\hat{\sigma}^2$.

(iii) It may be noted that the inversion formulae, through which the
evaluation of the mixed moments is proposed to be done, are given
for the orders (a,b) satisfying $(a + b) \leq 6$. A doubt about the
adequacy of these formulae arises, naturally, when one finds that
mixed moments of orders (a,b) with $(a + b) > 6$ are also involved
in the series in (3.3.33). This need not, however, be a matter of
worry. The involvement of the term $\nu^{-(i + j + \frac{1}{2})}$
and of the constants $\alpha_{ip}$, $\beta_{jq}$, which decrease
rapidly as the series expands from one term to the next, assures
that by the time the series has expanded up to a term involving such
a moment, the magnitude of the factor
$\alpha_{ip} \beta_{jq} \nu^{-(i + j + \frac{1}{2})}$ will already
have become so small that the contribution of this term is
negligible. The same is true of the subsequent terms in the series.
The proposed formulae are, therefore, adequate for the present
purpose.


3.4 CASE 3: UNKNOWN, HETEROGENEOUS VARIANCES OF ERRORS

We finally consider the situation in which the heterogeneity of
variances, $\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2$, may seem to
be a more reasonable assumption. Thus, in addition to the basic
assumptions B(i) through B(v), we now assume that

C3(i)  $\sigma_u^2 \neq \sigma_v^2$, u, v = 1, 2, ..., m; $u \neq v$,

C3(ii) $\sigma_u^2$ is unknown, u = 1, 2, ..., m.

With this set-up we derive an expression for $K_{u,v,n+1}$ to be used
in the criterion function (2.4.5). Let $Y_{n+1}$ be the random
variable on which a discriminatory observation, $y_{n+1}$, is yet to
be realized through a suitable setting of the input $\xi_{n+1}$. In
the present state of ignorance we need the p.d.f. of $Y_{n+1}$ given
n observations only. However, we start with the basic assumptions
B(i) through B(v) only, ignoring C3(i) and C3(ii) for the time
being. In that case, from the derivations in Section 3.2 we know
that, given $\sigma_u^2$ (= $\lambda_u$ (say)) and n observations,
y, the p.d.f. of $Y_{n+1}$, under model u, can be expressed as
[refer to equation (3.2.17)]

\[ f_{n+1}^{(u)}(y_{n+1}/y, \lambda_u) = [2\pi\lambda_u (1 + z_u)]^{-1/2} \exp\left[ -\frac{1}{2} \left\{ \frac{(y_{n+1} - \hat{y}_{n+1}^{(u)})^2}{\lambda_u (1 + z_u)} \right\} \right], \tag{3.4.1} \]

where $\hat{y}_{n+1}^{(u)}$ is an estimated value of $Y_{n+1}$, using
model u, $z_u$ is given through the relation (3.2.14) and the partial
derivatives involved therein are given by (3.2.4).

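Nevertheless, in the present case, according to the assumption C3(ii), $\sigma_u^2$ is unknown. We, therefore, resort to the use of a noninformative prior for $\lambda_u$, i.e.,

$$ g(\lambda_u) \propto \frac{1}{\lambda_u}. $$

Besides, we know that the sampling distribution of $(\nu_u\hat{\sigma}_u^2/\sigma_u^2)$ is chi-square with $\nu_u$ degrees of freedom, so that

$$ p(\hat{\sigma}_u^2 \mid \lambda_u) = \frac{1}{\Gamma(\nu_u/2)} \left(\frac{\nu_u}{2\lambda_u}\right)^{\nu_u/2} (\hat{\sigma}_u^2)^{\nu_u/2 - 1} \exp\left(-\frac{\nu_u\hat{\sigma}_u^2}{2\lambda_u}\right), \quad \lambda_u > 0, \tag{3.4.2} $$

where $\hat{\sigma}_u^2$ is an appropriate estimate of $\sigma_u^2$, obtained through model u alone, and $\nu_u$ is the number of degrees of freedom attached to this estimate. After going through the same algebra as was used in proceeding from (3.3.1) to (3.3.9) in Section 3.3 (with $\sigma^2$ and $\hat{\sigma}^2$ replaced by $\sigma_u^2$ and $\hat{\sigma}_u^2$, respectively) the required p.d.f. of $Y_{n+1}$ under model u can be found to be of the form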


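$$ f^{(u)}_{n+1}(y_{n+1} \mid y) = [B(\nu_u/2, 1/2)\, \nu_u^{1/2} s_u]^{-1} \left[1 + \frac{(y_{n+1} - \hat{y}^{(u)}_{n+1})^2}{\nu_u s_u^2}\right]^{-(\nu_u+1)/2}, \tag{3.4.3} $$

where

$$ s_u = \hat{\sigma}_u (1 + z_u)^{1/2}. \tag{3.4.4} $$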


It may be noted that the only prior information used in (3.4.3) comes from the previous n observations. Similarly, under model v (v ≠ u), the p.d.f. of $Y_{n+1}$ is given by

$$ f^{(v)}_{n+1}(y_{n+1} \mid y) = [B(\nu_v/2, 1/2)\, \nu_v^{1/2} s_v]^{-1} \left[1 + \frac{(y_{n+1} - \hat{y}^{(v)}_{n+1})^2}{\nu_v s_v^2}\right]^{-(\nu_v+1)/2}, \tag{3.4.5} $$

with

$$ s_v = \hat{\sigma}_v (1 + z_v)^{1/2}. \tag{3.4.6} $$


In order to obtain an expression for measuring the dissimilarity between models u and v, the two densities $f^{(u)}_{n+1}$ and $f^{(v)}_{n+1}$ thus obtained are to be further utilized in the relation

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = \int_R [f^{(u)}_{n+1}(y_{n+1} \mid y)\, f^{(v)}_{n+1}(y_{n+1} \mid y)]^{1/2}\, dy_{n+1}. \tag{3.4.7} $$
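The relation (3.4.7) can be checked numerically before any series expansion is attempted. The following is a minimal sketch, assuming the Student-t form (3.4.3) for both densities; the predicted values, scales and degrees of freedom are hypothetical illustration values, and the midpoint rule is our stand-in for the term-by-term evaluation developed in this section.

```python
import math

def t_pdf(y, y_hat, s, nu):
    """Density of the form (3.4.3): location y_hat, scale s, nu degrees of freedom."""
    beta = math.gamma(nu / 2) * math.gamma(0.5) / math.gamma((nu + 1) / 2)  # B(nu/2, 1/2)
    t = (y - y_hat) ** 2 / s ** 2
    return (1 + t / nu) ** (-(nu + 1) / 2) / (beta * math.sqrt(nu) * s)

def affinity(f, g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of sqrt(f * g) over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        y = lo + (k + 0.5) * step
        total += math.sqrt(f(y) * g(y))
    return total * step

# Hypothetical predictions of two rival models at one candidate setting.
f_u = lambda y: t_pdf(y, y_hat=1.0, s=0.8, nu=12)
f_v = lambda y: t_pdf(y, y_hat=2.5, s=1.1, nu=9)

h3 = affinity(f_u, f_v, -60.0, 60.0)      # similarity of the two densities
K = math.sqrt(1.0 - h3)                   # distance [1 - h3]^(1/2)
h_self = affinity(f_u, f_u, -60.0, 60.0)  # a density compared with itself
print(h3, K, h_self)
```

A density compared with itself gives a value close to one, and the distance grows as the two predictions move apart relative to their scales.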


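In fact, as a result of substitution in (3.4.7) from (3.4.3) and (3.4.5) we get

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = [B(\nu_u/2, 1/2)\, B(\nu_v/2, 1/2)\, \nu_u^{1/2} \nu_v^{1/2} s_u s_v]^{-1/2} \int_R \left(1 + \frac{t_u}{\nu_u}\right)^{-(\nu_u+1)/4} \left(1 + \frac{t_v}{\nu_v}\right)^{-(\nu_v+1)/4} dy_{n+1}, \tag{3.4.8} $$

where

$$ t_i = \frac{(y_{n+1} - \hat{y}^{(i)}_{n+1})^2}{s_i^2}, \quad i = u, v. \tag{3.4.9} $$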


The integral in (3.4.8) is again intractable. However, it is of the same form as in (3.3.13) and can, therefore, be dealt with similarly. Proceeding on the same lines we can express each factor of the integrand in an alternative form. Thus, utilizing (3.3.22), the first factor can be written as

$$ \left(1 + \frac{t_u}{\nu_u}\right)^{-(\nu_u+1)/4} = \exp\left(-\tfrac{1}{4} t_u\right) \sum_{i=0}^{\infty} A_i(t_u)\, \nu_u^{-i}, \tag{3.4.10} $$

where $A_i(t_u)$ is a polynomial in $t_u$ and is given by

$$ A_i(t_u) = \sum_{p=1}^{i+1} \alpha_{ip}\, t_u^{2i-p+1}. \tag{3.4.11} $$

Similarly, the second factor can be expressed as

$$ \left(1 + \frac{t_v}{\nu_v}\right)^{-(\nu_v+1)/4} = \exp\left(-\tfrac{1}{4} t_v\right) \sum_{j=0}^{\infty} B_j(t_v)\, \nu_v^{-j}, \tag{3.4.12} $$



where

$$ B_j(t_v) = \sum_{q=1}^{j+1} \beta_{jq}\, t_v^{2j-q+1}. \tag{3.4.13} $$


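The coefficients $\alpha_{ip}$'s and $\beta_{jq}$'s appearing in (3.4.11) and (3.4.13) are given in Table 3.3.1. Multiplying the two factors from (3.4.10) and (3.4.12), we further have

$$ \left(1 + \frac{t_u}{\nu_u}\right)^{-(\nu_u+1)/4} \left(1 + \frac{t_v}{\nu_v}\right)^{-(\nu_v+1)/4} = \exp\{-\tfrac{1}{4}(t_u + t_v)\} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} Q_{ij}(t_u, t_v)\, \nu_u^{-i} \nu_v^{-j}, \tag{3.4.14} $$

where $Q_{ij}(t_u, t_v)$ represents a bivariate polynomial in $t_u$ and $t_v$, obtained by multiplying $A_i(t_u)$ and $B_j(t_v)$ from (3.4.11) and (3.4.13), respectively, and is given by

$$ Q_{ij}(t_u, t_v) = \sum_{p=1}^{i+1} \sum_{q=1}^{j+1} \alpha_{ip} \beta_{jq}\, t_u^{2i-p+1}\, t_v^{2j-q+1}. \tag{3.4.15} $$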


Now, using (3.4.9) the sum $(t_u + t_v)$ can be written as

$$ t_u + t_v = \frac{(y_{n+1} - \hat{y}^{(u)}_{n+1})^2}{s_u^2} + \frac{(y_{n+1} - \hat{y}^{(v)}_{n+1})^2}{s_v^2}. $$

But, through (A.1.11) and (A.1.12) of Corollary A.1 (Appendix A) we can, alternatively, express this sum as

$$ t_u + t_v = \frac{(y_{n+1} - y^*_{n+1})^2}{s^2} + \frac{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})^2}{s_u^2 + s_v^2}, \tag{3.4.16} $$



with

$$ y^*_{n+1} = \frac{s_v^2\, \hat{y}^{(u)}_{n+1} + s_u^2\, \hat{y}^{(v)}_{n+1}}{s_u^2 + s_v^2} \quad \text{and} \quad s = \left\{\frac{s_u^2 s_v^2}{s_u^2 + s_v^2}\right\}^{1/2}. \tag{3.4.17} $$
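The step from the direct sum of the two quadratic terms to the completed-square form (3.4.16) is pure algebra and can be confirmed numerically. The sketch below is only an illustration; the function names and the sample values are hypothetical and are not taken from Appendix A.

```python
def direct_sum(y, yu, yv, su, sv):
    """t_u + t_v written directly from the definition (3.4.9)."""
    return (y - yu) ** 2 / su ** 2 + (y - yv) ** 2 / sv ** 2

def split_sum(y, yu, yv, su, sv):
    """The same sum in the form (3.4.16), using y* and s from (3.4.17)."""
    y_star = (sv ** 2 * yu + su ** 2 * yv) / (su ** 2 + sv ** 2)
    s_sq = su ** 2 * sv ** 2 / (su ** 2 + sv ** 2)
    return (y - y_star) ** 2 / s_sq + (yu - yv) ** 2 / (su ** 2 + sv ** 2)

# Hypothetical values: (y, y_hat_u, y_hat_v, s_u, s_v).
samples = [(0.3, 1.0, 2.5, 0.8, 1.1), (-4.0, 0.2, -1.7, 2.0, 0.5)]
gaps = [abs(direct_sum(*v) - split_sum(*v)) for v in samples]
print(gaps)
```

Only the first term of the split form depends on $y_{n+1}$, which is what allows the second term to be taken outside the integral.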


The expression for $(t_u + t_v)$ from (3.4.16) when used in (3.4.14) further gives

$$ \left(1 + \frac{t_u}{\nu_u}\right)^{-(\nu_u+1)/4} \left(1 + \frac{t_v}{\nu_v}\right)^{-(\nu_v+1)/4} = \exp\left[-\frac{1}{4}\,\frac{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})^2}{s_u^2 + s_v^2}\right] \left[\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} Q_{ij}(t_u, t_v)\, \nu_u^{-i} \nu_v^{-j} \exp\left\{-\frac{1}{4}\,\frac{(y_{n+1} - y^*_{n+1})^2}{s^2}\right\}\right]. \tag{3.4.18} $$

Equation (3.4.18) presents an alternative expression for the integrand in (3.4.8). Subsequently, its use in (3.4.8) gives

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = [B(\nu_u/2, 1/2)\, B(\nu_v/2, 1/2)\, \nu_u^{1/2} \nu_v^{1/2} s_u s_v]^{-1/2} \exp\left[-\frac{1}{4}\,\frac{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})^2}{s_u^2 + s_v^2}\right] \times \int_R \left[\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} Q_{ij}(t_u, t_v)\, \nu_u^{-i} \nu_v^{-j} \exp\left\{-\frac{1}{4}\,\frac{(y_{n+1} - y^*_{n+1})^2}{s^2}\right\}\right] dy_{n+1}. \tag{3.4.19} $$



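We notice that the integral in (3.4.19) is of the same form as the integral in (3.3.31). It can, therefore, likewise be evaluated term by term. Thus, when $Y_{n+1}$ in this case is considered to be distributed as $N(y^*_{n+1}, 2s^2)$, as is suggested by the integrand in (3.4.19), and $\mu_{a,b}$ is used to denote $E(t_u^a t_v^b)$, the same argument enables us to write

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = [B(\nu_u/2, 1/2)\, B(\nu_v/2, 1/2)\, \nu_u^{1/2} \nu_v^{1/2} s_u s_v]^{-1/2} (4\pi s^2)^{1/2} \exp\left[-\frac{1}{4}\,\frac{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})^2}{s_u^2 + s_v^2}\right] \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{p=1}^{i+1} \sum_{q=1}^{j+1} \alpha_{ip} \beta_{jq}\, \mu_{2i-p+1,\,2j-q+1}\, \nu_u^{-i} \nu_v^{-j}. $$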


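Finally, substituting for s from (3.4.17) and rearranging the terms we get

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C^* \left[\frac{2 s_u s_v}{s_u^2 + s_v^2}\right]^{1/2} \exp\left[-\frac{1}{4}\left\{\frac{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})^2}{s_u^2 + s_v^2}\right\}\right], \tag{3.4.20} $$

where

$$ C^* = [(2\pi)^{-1} B(\nu_u/2, 1/2)\, B(\nu_v/2, 1/2)]^{-1/2} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{p=1}^{i+1} \sum_{q=1}^{j+1} \alpha_{ip} \beta_{jq}\, \mu_{2i-p+1,\,2j-q+1}\, \nu_u^{-(i+1/4)} \nu_v^{-(j+1/4)}. \tag{3.4.21} $$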



The required expression for $K_{u,v}(\xi_{n+1})$ to be used in the criterion function (2.4.5) for the present case would be the one obtained through the function $[1 - h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1})]^{1/2}$.

As in the previous case, here, too, the mixed moments, $\mu_{a,b}$, can be found through the joint cumulants ($\kappa_{a,b}$, say) by means of the inversion formulae of Cook (1951), listed in D.5 of Appendix D. So far as the $\kappa_{a,b}$'s are concerned we use the same argument and from Lemma C.1 of Appendix C write the expression for the (a,b)th cumulant of $t_u$ and $t_v$ as

$$ \kappa_{a,b} = 2^{a+b-1} (a+b-2)!\, [(a+b-1) + a(a-1)\rho\, d_u^2 + b(b-1)\rho\, d_v^2 + 2ab\,\rho\, d_u d_v]\, (\rho_u/\rho)^a (\rho_v/\rho)^b, \tag{3.4.22} $$

where $\rho_s = (s_s^2)^{-1}$, $d_s = y^*_{n+1} - \hat{y}^{(s)}_{n+1}$, $s = u, v$, and $\rho = (\rho_u + \rho_v)/2$.
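For the two lowest orders the cumulant expression (3.4.22) can be compared with the elementary variance and covariance of squared normal deviates. The sketch below assumes, as in the text, that the reference variable is normal with mean $y^*_{n+1}$ and variance $2s^2 = 1/\rho$; the function name `kappa` and the numerical inputs are hypothetical.

```python
import math

def kappa(a, b, rho_u, rho_v, d_u, d_v):
    """Joint cumulant of t_u and t_v in the form (3.4.22); needs a + b >= 2."""
    rho = (rho_u + rho_v) / 2
    bracket = ((a + b - 1)
               + a * (a - 1) * rho * d_u ** 2
               + b * (b - 1) * rho * d_v ** 2
               + 2 * a * b * rho * d_u * d_v)
    return (2 ** (a + b - 1) * math.factorial(a + b - 2) * bracket
            * (rho_u / rho) ** a * (rho_v / rho) ** b)

# Hypothetical scales and predictions of the two models.
su, sv, yu, yv = 0.8, 1.1, 1.0, 2.5
rho_u, rho_v = 1 / su ** 2, 1 / sv ** 2
y_star = (rho_u * yu + rho_v * yv) / (rho_u + rho_v)
d_u, d_v = y_star - yu, y_star - yv
var_y = 2 / (rho_u + rho_v)  # variance 2 s^2 of the reference normal

# Elementary results for Z ~ N(d, var): Var(Z^2) = 2 var^2 + 4 var d^2.
var_tu = rho_u ** 2 * (2 * var_y ** 2 + 4 * var_y * d_u ** 2)
cov_uv = rho_u * rho_v * (2 * var_y ** 2 + 4 * var_y * d_u * d_v)

k20 = kappa(2, 0, rho_u, rho_v, d_u, d_v)
k11 = kappa(1, 1, rho_u, rho_v, d_u, d_v)
print(k20, var_tu, k11, cov_uv)
```

Agreement at orders (2,0) and (1,1) does not prove the general formula, but it is a quick guard against sign or index slips when the expression is coded.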

We close this section with a few remarks.

Remarks:

(i) So far as the divergence of the probability distributions for designing a new experiment is concerned, $h_3$, and hence $K_{u,v}(\xi_{n+1})$, considers the dissimilarity of these distributions, in terms of their expectations and orientations, due to the observation being awaited.

(ii) It may be noted that, except for the factor $C^*$, the form of $h_3$ is similar to that of $h_2$ of the previous case. However, the estimate of the unknown variance is no longer a pooled estimate.

(iii) Lastly, the inversion formulae, which seem to fall short in this case, too, may similarly be confirmed to be adequate. This time, it is the factor $\alpha_{ip}\beta_{jq}\,\nu_u^{-(i+1/4)}\nu_v^{-(j+1/4)}$ which plays the role and renders the contribution of the moments $\mu_{a,b}$, (a+b) > 6, insignificant.



CHAPTER 4 

DESIGN OF EXPERIMENTS FOR MODEL DISCRIMINATION 
IN MULTIRESPONSE SYSTEMS 


4.1 BASIC ASSUMPTIONS 

There are many fields in which the interest of the 
investigator often centres around two or more responses rather 
than a single response. Since there lies an advantage in 
considering all the responses together, it is important to 
formulate ways for designing experiments for model discrimination 
in a multiresponse system. In this chapter we plan to develop 
the design criteria under different types of assumptions which 
an investigator might think reasonable in a given situation. 

In Chapter 2, we proposed that if the aim is to decide on an optimal experimental setting, the objective function

$$ \Phi(\xi_{n+1}) = \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} w_{u,v,n}\, K_{u,v}(\xi_{n+1}) $$

may be maximized with respect to $\xi_{n+1}$ over the operability region. While the weights in this function have already been proposed to be obtained through the values of the discrimination index from the nth stage, the specification of the relevant form of $K_{u,v}(\xi_{n+1})$, under different types of conditions in a multiresponse system, is still pending. In this chapter we subject the distance function, $K_{u,v}(\xi_{n+1})$, to different sets of assumptions. However, in order to develop the distribution theory necessary for deriving the particular forms, certain basic assumptions are required to be made.
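The way the objective function is used can be sketched as a search over candidate settings. This is a minimal illustration only: the three prediction functions, the unit weights, the grid, and the stand-in distance (the affinity of two normal densities with a common known variance) are all hypothetical and are not the forms derived in this chapter.

```python
import math

def criterion(xi, weights, distance):
    """Weighted sum of pairwise distances K_{u,v}(xi) over all pairs u < v."""
    m = len(weights)
    return sum(weights[u][v] * distance(u, v, xi)
               for u in range(m - 1) for v in range(u + 1, m))

def best_setting(candidates, weights, distance):
    """Candidate setting with the largest criterion value."""
    return max(candidates, key=lambda xi: criterion(xi, weights, distance))

# Hypothetical single-response predictions of three rival models.
predict = [lambda x: 1.0 + x, lambda x: 1.0 + x ** 2, lambda x: math.exp(0.5 * x)]

def toy_distance(u, v, xi, s2=0.5):
    """Stand-in K = sqrt(1 - h), with h the affinity of two equal-variance normals."""
    gap = predict[u](xi) - predict[v](xi)
    return math.sqrt(1.0 - math.exp(-gap ** 2 / (8 * s2)))

grid = [0.5 * k for k in range(7)]    # candidate settings in the operability region
w = [[1.0] * 3 for _ in range(3)]     # only entries with u < v are used
best = best_setting(grid, w, toy_distance)
print(best, criterion(best, w, toy_distance))
```

In practice the grid search would be replaced by whatever optimizer suits the operability region, and `toy_distance` by the distance function appropriate to the case at hand.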


We consider an r-response (r ≥ 2) system and suppose that there are m r-equation models, $M^{(1)}, M^{(2)}, \ldots, M^{(m)}$, which are being discriminated. In model u, i.e., in the set of equations

$$ y_{ik} = \eta_i^{(u)}(\xi_k, \theta^{(u)}) + \varepsilon_{ik}^{(u)}, \quad i = 1, 2, \ldots, r; \; k = 1, 2, \ldots, \tag{4.1.1} $$
we assume that 


B + (i) E< u ><e<£>) = 0 , 

B + (ii) = 0 , 


B + (iv) the error vector = ( e i c i* e k2' * * * ,e kr^ c ^ s ^ r ^ u '* ;e< ^ 
as N r (0,Z u ), where ^ 

k, & — 1,2,..., n, . • § / kf l , Q — 1,2,. .«,r , 

B+(v) the model u, if nonlinear, can be approximated by a linear form in the parameter space;

the assumptions B+(i) through B+(v) being applicable to all the m models.


4.2 CASE 1: KNOWN, EQUAL COVARIANCE MATRICES

In a given situation, with all the above assumptions holding good, it may further be possible to assume that

C+1(i) $\Sigma_1 = \Sigma_2 = \ldots = \Sigma_m$ (= $\Sigma$),

C+1(ii) the error covariance matrix $\Sigma$ is known in advance.

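We now seek the design criterion function when the models are nonlinear; the ones linear in parameters do not require any alteration in the results thus obtained. Let $Y_{n+1}$ be the random vector on which a discriminatory observation is being sought. We first evaluate the p.d.f. of $Y_{n+1}$ under model u when the common covariance matrix, $\Sigma$, is given. With the basic assumptions holding good and the expectation $E^{(u)}(Y_{n+1})$ [= $\eta^{(u)}_{n+1}$] as well as $\Sigma$ being known, the p.d.f. of $Y_{n+1}$ is given by

$$ f^{(u)}(y_{n+1} \mid \eta^{(u)}_{n+1}, \Sigma) = [(2\pi)^r |\Sigma|]^{-1/2} \exp\left[-\tfrac{1}{2} (y_{n+1} - \eta^{(u)}_{n+1})' \Sigma^{-1} (y_{n+1} - \eta^{(u)}_{n+1})\right]. \tag{4.2.1} $$

However, at the nth stage the expectation $\eta^{(u)}_{n+1}$ is not known. It is, therefore, the p.d.f. of $Y_{n+1}$ given y and $\Sigma$ which is required in the present set-up. In fact, the p.d.f. of $Y_{n+1}$, $f^{(u)}_{n+1}$ (say), in this case can be obtained through the relation

$$ f^{(u)}_{n+1}(y_{n+1} \mid y, \Sigma) = \int_{R^r} f^{(u)}(y_{n+1} \mid \eta^{(u)}_{n+1}, \Sigma)\, g^{(u)}(\eta^{(u)}_{n+1} \mid y, \Sigma)\, d\eta^{(u)}_{n+1}. \tag{4.2.2} $$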

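The first factor in the integrand in (4.2.2) is given in (4.2.1). In the following we seek an expression for the second factor. Now, under model u, we have

$$ E^{(u)}(Y_{ik}) = \eta_i^{(u)}(\xi_k, \theta^{(u)}), \quad i = 1, 2, \ldots, r. \tag{4.2.3} $$

The assumption B+(v) permits the linearization of the nonlinear response function $\eta_i^{(u)}(\xi_k, \theta^{(u)})$ in the parameter space, $R_{\theta^{(u)}}$ (say). Therefore, if we decide to linearize $\eta_i^{(u)}$ about the m.l.e., $\hat{\theta}^{(u)}$, of $\theta^{(u)}$, then from B+(i) we can write

$$ E^{(u)}(Y_{ik}) = \eta_i^{(u)}(\xi_k, \hat{\theta}^{(u)}) + x_{ik}^{(u)\prime} (\theta^{(u)} - \hat{\theta}^{(u)}), \quad i = 1, 2, \ldots, r, \tag{4.2.4} $$

where

$$ x_{ik}^{(u)} = (x_{ik1}^{(u)}, x_{ik2}^{(u)}, \ldots, x_{ikp_u}^{(u)})', \quad i = 1, 2, \ldots, r, \tag{4.2.5} $$

and

$$ x_{ikt}^{(u)} = \left[\frac{\partial \eta_i^{(u)}(\xi_k, \theta^{(u)})}{\partial \theta_t^{(u)}}\right]_{\theta^{(u)} = \hat{\theta}^{(u)}}, \quad t = 1, 2, \ldots, p_u. \tag{4.2.6} $$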
. . . 

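The relation (4.2.4) can also be written as

$$ E^{(u)}(Y_k) = \eta^{(u)}(\xi_k, \hat{\theta}^{(u)}) + X_k^{(u)} (\theta^{(u)} - \hat{\theta}^{(u)}), \tag{4.2.7} $$

where $\eta^{(u)}$ is an r-vector and

$$ X_k^{(u)} = (x_{1k}^{(u)}, x_{2k}^{(u)}, \ldots, x_{rk}^{(u)})' \tag{4.2.8} $$

is an $r \times p_u$ matrix of partial derivatives, $x_{ikt}^{(u)}$, given by (4.2.6). In particular, from (4.2.7) we have

$$ E^{(u)}(Y_{n+1}) = \eta^{(u)}(\xi_{n+1}, \hat{\theta}^{(u)}) + X_{n+1}^{(u)} (\theta^{(u)} - \hat{\theta}^{(u)}), \tag{4.2.9} $$

i.e.,

$$ \eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1} = X_{n+1}^{(u)} (\theta^{(u)} - \hat{\theta}^{(u)}), \tag{4.2.10} $$

where

$$ \hat{y}^{(u)}_{n+1} = \eta^{(u)}(\xi_{n+1}, \hat{\theta}^{(u)}). \tag{4.2.11} $$

This suggests that the distribution of $(\eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1})$ is identical with that of $X_{n+1}^{(u)}(\theta^{(u)} - \hat{\theta}^{(u)})$. In the following we adopt the Bayesian approach and seek the posterior distribution of $(\theta^{(u)} - \hat{\theta}^{(u)})$.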

/S^ A/ 

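Now, the likelihood function of n sets of data, y, under model u, is given by

$$ L(\theta^{(u)} \mid y, \Sigma) = [(2\pi)^r |\Sigma|]^{-n/2} \exp\left[-\frac{1}{2} \sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \sigma^{ij} v_{ijk}^{(u)}\right], \tag{4.2.12} $$

where y, an n×r matrix, represents the whole set of data, $\sigma^{ij}$ denotes the (i,j)th element of the inverse of $\Sigma$, and

$$ v_{ijk}^{(u)} = \{y_{ik} - \eta_i^{(u)}(\xi_k, \theta^{(u)})\}\{y_{jk} - \eta_j^{(u)}(\xi_k, \theta^{(u)})\}. $$

If we let $e_{ik}^{(u)} = \{y_{ik} - \eta_i^{(u)}(\xi_k, \hat{\theta}^{(u)})\}$, $i = 1, 2, \ldots, r$, then using the linear form (4.2.4) of the response function we can write

$$ v_{ijk}^{(u)} = \{e_{ik}^{(u)} - x_{ik}^{(u)\prime}(\theta^{(u)} - \hat{\theta}^{(u)})\}\{e_{jk}^{(u)} - x_{jk}^{(u)\prime}(\theta^{(u)} - \hat{\theta}^{(u)})\}. $$

Further, using the fact that $\hat{\theta}^{(u)}$ is an m.l.e. of $\theta^{(u)}$, the cross-product terms vanish, i.e.,

$$ \sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \sigma^{ij} e_{ik}^{(u)} x_{jk}^{(u)} = 0. $$

The likelihood function L in (4.2.12) thus becomes

$$ L(\theta^{(u)} \mid y, \Sigma) = [(2\pi)^r |\Sigma|]^{-n/2} \exp\left[-\frac{1}{2} \sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \sigma^{ij} e_{ik}^{(u)} e_{jk}^{(u)}\right] \times \left[\exp\{-\tfrac{1}{2} (\theta^{(u)} - \hat{\theta}^{(u)})' M_u (\theta^{(u)} - \hat{\theta}^{(u)})\}\right], \tag{4.2.13} $$

where

$$ M_u = \sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \sigma^{ij} x_{ik}^{(u)} x_{jk}^{(u)\prime}. \tag{4.2.14} $$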


Now, if we assume a locally uniform prior distribution for $\theta^{(u)}$, then the posterior probability density of $\theta^{(u)}$ is given by

$$ g(\theta^{(u)} \mid y, \Sigma) = \frac{L(\theta^{(u)} \mid y, \Sigma)}{\int L(\theta^{(u)} \mid y, \Sigma)\, d\theta^{(u)}}. \tag{4.2.15} $$

Using, therefore, the likelihood (4.2.13) in (4.2.15) we get

$$ g(\theta^{(u)} \mid y, \Sigma) = [(2\pi)^{-p_u} |M_u|]^{1/2} \exp\left[-\tfrac{1}{2}\{(\theta^{(u)} - \hat{\theta}^{(u)})' M_u (\theta^{(u)} - \hat{\theta}^{(u)})\}\right], \tag{4.2.16} $$

where $M_u$ is given by (4.2.14).

This shows that the posterior probability distribution of $(\theta^{(u)} - \hat{\theta}^{(u)})$ is a $p_u$-variate normal distribution with mean vector 0 and covariance matrix $M_u^{-1}$. Further, the r-vector $X_{n+1}^{(u)}(\theta^{(u)} - \hat{\theta}^{(u)})$, being a linear combination of normal variates, is an r-variate normal random vector, distributed about 0 with covariance matrix $X_{n+1}^{(u)} M_u^{-1} X_{n+1}^{(u)\prime}$ (= $C_u$ (say)). Thus the relation (4.2.10) leads us to the conclusion that the vector $(\eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1})$ has an r-variate normal distribution, $N_r(0, C_u)$. Accordingly, the second factor of the integrand in (4.2.2) can be written as

$$ g^{(u)}(\eta^{(u)}_{n+1} \mid y, \Sigma) = [(2\pi)^r |C_u|]^{-1/2} \exp\left[-\tfrac{1}{2} (\eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1})' C_u^{-1} (\eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1})\right]. \tag{4.2.17} $$

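Substituting for $f^{(u)}$ and $g^{(u)}$ in (4.2.2) from (4.2.1) and (4.2.17), respectively, we get

$$ f^{(u)}_{n+1}(y_{n+1} \mid y, \Sigma) = [(2\pi)^{2r} |\Sigma| |C_u|]^{-1/2} \int_{R^r} \exp\{-\tfrac{1}{2} Q_1(\eta^{(u)}_{n+1})\}\, d\eta^{(u)}_{n+1}, \tag{4.2.18} $$

where

$$ Q_1(\eta^{(u)}_{n+1}) = (y_{n+1} - \eta^{(u)}_{n+1})' \Sigma^{-1} (y_{n+1} - \eta^{(u)}_{n+1}) + (\eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1})' C_u^{-1} (\eta^{(u)}_{n+1} - \hat{y}^{(u)}_{n+1}). \tag{4.2.19} $$

Now, using equations (A.1.1) through (A.1.3) of Lemma A.1 (Appendix A), the function $Q_1(\eta^{(u)}_{n+1})$ can alternatively be expressed as

$$ Q_1(\eta^{(u)}_{n+1}) = (\eta^{(u)}_{n+1} - \mu^{(u)}_{n+1})' (\Sigma^{-1} + C_u^{-1}) (\eta^{(u)}_{n+1} - \mu^{(u)}_{n+1}) + (y_{n+1} - \hat{y}^{(u)}_{n+1})' (\Sigma + C_u)^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1}), \tag{4.2.20} $$

where $\mu^{(u)}_{n+1}$ is given by

$$ \mu^{(u)}_{n+1} = (\Sigma^{-1} + C_u^{-1})^{-1} (\Sigma^{-1} y_{n+1} + C_u^{-1} \hat{y}^{(u)}_{n+1}). \tag{4.2.21} $$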


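Making use of this form of $Q_1(\eta^{(u)}_{n+1})$ in (4.2.18) we get

$$ f^{(u)}_{n+1}(y_{n+1} \mid y, \Sigma) = [\{(2\pi)^{-r} |\Sigma^{-1}| |C_u^{-1}|\}^{1/2}] \exp\left[-\tfrac{1}{2} (y_{n+1} - \hat{y}^{(u)}_{n+1})' (\Sigma + C_u)^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1})\right] \times (2\pi)^{-r/2} \int_{R^r} \exp\left[-\tfrac{1}{2}\{(\eta^{(u)}_{n+1} - \mu^{(u)}_{n+1})' (\Sigma^{-1} + C_u^{-1}) (\eta^{(u)}_{n+1} - \mu^{(u)}_{n+1})\}\right] d\eta^{(u)}_{n+1}, $$

i.e.,

$$ f^{(u)}_{n+1}(y_{n+1} \mid y, \Sigma) = [(2\pi)^{-r} |\Sigma^{-1}| |C_u^{-1}| |\Sigma^{-1} + C_u^{-1}|^{-1}]^{1/2} \exp\left[-\tfrac{1}{2} (y_{n+1} - \hat{y}^{(u)}_{n+1})' (\Sigma + C_u)^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1})\right]. \tag{4.2.22} $$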


Since $\Sigma^{-1}$ and $C_u^{-1}$ exist we can use the identity

$$ \Sigma^{-1} + C_u^{-1} = \Sigma^{-1} (\Sigma + C_u) C_u^{-1} $$

in (4.2.22) and conclude that, given $\Sigma$ and n sets of observations, y, the p.d.f. of $Y_{n+1}$ under model u is given by

$$ f^{(u)}_{n+1}(y_{n+1} \mid y, \Sigma) = [(2\pi)^r |\Sigma_u|]^{-1/2} \exp\left[-\tfrac{1}{2}\{(y_{n+1} - \hat{y}^{(u)}_{n+1})' \Sigma_u^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1})\}\right], \tag{4.2.23} $$

where

$$ \Sigma_u = \Sigma + C_u. \tag{4.2.24} $$


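Similarly, given $\Sigma$ and n sets of observations, the p.d.f. of $Y_{n+1}$ under model v would be

$$ f^{(v)}_{n+1}(y_{n+1} \mid y, \Sigma) = [(2\pi)^r |\Sigma_v|]^{-1/2} \exp\left[-\tfrac{1}{2}\{(y_{n+1} - \hat{y}^{(v)}_{n+1})' \Sigma_v^{-1} (y_{n+1} - \hat{y}^{(v)}_{n+1})\}\right], \tag{4.2.25} $$

with

$$ \Sigma_v = \Sigma + C_v. \tag{4.2.26} $$

Consider now the function

$$ h_1(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = \int_{R^r} (f^{(u)}_{n+1} f^{(v)}_{n+1})^{1/2}\, dy_{n+1}. \tag{4.2.27} $$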


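Substituting for $f^{(u)}_{n+1}$ and $f^{(v)}_{n+1}$ in (4.2.27) from (4.2.23) and (4.2.25) we get

$$ h_1(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = [(2\pi)^{2r} |\Sigma_u| |\Sigma_v|]^{-1/4} \int_{R^r} \exp\{-\tfrac{1}{4} Q_2(y_{n+1})\}\, dy_{n+1}, \tag{4.2.28} $$

where

$$ Q_2(y_{n+1}) = (y_{n+1} - \hat{y}^{(u)}_{n+1})' \Sigma_u^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1}) + (y_{n+1} - \hat{y}^{(v)}_{n+1})' \Sigma_v^{-1} (y_{n+1} - \hat{y}^{(v)}_{n+1}). $$

Using (A.1.1) through (A.1.3) of Lemma A.1 (Appendix A) we can write $Q_2(y_{n+1})$ as

$$ Q_2(y_{n+1}) = (y_{n+1} - y^*_{n+1})' \Sigma^{*-1} (y_{n+1} - y^*_{n+1}) + (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\Sigma_u + \Sigma_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1}), $$

where

$$ y^*_{n+1} = \Sigma^* (\Sigma_u^{-1} \hat{y}^{(u)}_{n+1} + \Sigma_v^{-1} \hat{y}^{(v)}_{n+1}) \quad \text{and} \quad \Sigma^* = \Sigma_u (\Sigma_u + \Sigma_v)^{-1} \Sigma_v. \tag{4.2.29} $$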


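The expression for $Q_2(y_{n+1})$, when used in (4.2.28), gives

$$ h_1(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = [(2\pi)^{2r} |\Sigma_u| |\Sigma_v|]^{-1/4} \exp\left[-\tfrac{1}{4} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\Sigma_u + \Sigma_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\right] \times \int_{R^r} \exp\left[-\tfrac{1}{4} (y_{n+1} - y^*_{n+1})' \Sigma^{*-1} (y_{n+1} - y^*_{n+1})\right] dy_{n+1} $$

$$ = [|\Sigma_u| |\Sigma_v| |2\Sigma^*|^{-2}]^{-1/4} \exp\left[-\tfrac{1}{4} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\Sigma_u + \Sigma_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\right]. $$

Substituting for $\Sigma^*$ from (4.2.29) and rearranging the terms we get

$$ h_1(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = \left[\frac{|\Sigma_u| |\Sigma_v|}{|(\Sigma_u + \Sigma_v)/2|^2}\right]^{1/4} \exp\left[-\tfrac{1}{4}\{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\Sigma_u + \Sigma_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\}\right]. \tag{4.2.30} $$

Finally, utilizing the expression for $h_1$ from (4.2.30) in the function

$$ K_{u,v}(\xi_{n+1}) = [1 - h_1(f^{(u)}_{n+1}, f^{(v)}_{n+1})]^{1/2} $$

we can obtain the required distance function to be used in the criterion function $\Phi$ for designing the (n+1)th experiment, when n sets of observations are available, $\Sigma$, the common covariance matrix, is known, and the models are linear or have been linearized appropriately.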
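The closed form (4.2.30) is simple enough to evaluate directly. The following is a minimal sketch for r = 2 responses using only elementary 2×2 matrix operations; the helper names and the numerical inputs are hypothetical, and the two covariance arguments stand for the predictive matrices $\Sigma_u$ and $\Sigma_v$ of (4.2.24) and (4.2.26).

```python
import math

def det2(a):
    """Determinant of a 2x2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inv2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def affinity_normal(mean_u, cov_u, mean_v, cov_v):
    """h_1 in the form (4.2.30) for two bivariate normal predictive densities."""
    total = [[cov_u[i][j] + cov_v[i][j] for j in range(2)] for i in range(2)]
    half = [[x / 2 for x in row] for row in total]
    diff = [mean_u[i] - mean_v[i] for i in range(2)]
    ti = inv2(total)
    quad = sum(diff[i] * ti[i][j] * diff[j] for i in range(2) for j in range(2))
    scale = (det2(cov_u) * det2(cov_v) / det2(half) ** 2) ** 0.25
    return scale * math.exp(-quad / 4)

# Hypothetical predictions and predictive covariance matrices of two models.
h_example = affinity_normal([1.0, 0.5], [[1.0, 0.2], [0.2, 2.0]],
                            [2.0, 0.0], [[1.5, -0.1], [-0.1, 1.0]])
K_example = math.sqrt(1.0 - h_example)
print(h_example, K_example)
```

When the two predictive distributions coincide the function returns one and the distance is zero; with identity covariance matrices it reduces to the familiar $\exp(-\Delta^2/8)$ for a mean separation $\Delta$.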

4.3 CASE 2: UNKNOWN, EQUAL ERROR COVARIANCE MATRICES 

The design criterion developed in the last section cannot be used if $\Sigma$, the common covariance matrix, is not known, even though all the basic assumptions hold good and the requirement of equality of covariance matrices is met. In fact, there are situations in which, even if the investigator has reasons to believe that the errors in different models have the same covariance structure, knowledge about the common covariance matrix is lacking. In this section, we, therefore, develop the design criterion through which these types of situations could be suitably handled. Thus, whereas we retain the assumptions B+(i) through B+(v), we assume as well that

C+2(i) $\Sigma_1 = \Sigma_2 = \ldots = \Sigma_m$ (= $\Sigma$),

C+2(ii) $\Sigma$, the common covariance matrix, is unknown.



Let $Y_{n+1}$ be the random vector on which a discriminatory observation is awaited. In order to plan an experiment which would yield such a response under the given set of assumptions, namely, B+(i) through B+(v), C+2(i), and C+2(ii), we must, first, find the probability distribution of $Y_{n+1}$ when $\Sigma$ is unknown. With all the assumptions of the last section holding good, we know that, under model u, $Y_{n+1}$ is distributed normally with mean vector $\hat{y}^{(u)}_{n+1}$ and covariance matrix $\Sigma_u$, with $\Sigma_u$ given by

$$ \Sigma_u = \left[\Sigma + X_{n+1}^{(u)} \left\{\sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \sigma^{ij} x_{ik}^{(u)} x_{jk}^{(u)\prime}\right\}^{-1} X_{n+1}^{(u)\prime}\right]. \tag{4.3.1} $$

Presently, we are considering a situation in which the covariance matrix $\Sigma$ and hence $\Sigma^{-1}$ (= $\{\sigma^{ij}\}$) is not known in advance. If we still plan to make an allowance for the variances and covariances of errors in the design of experiments we ought to fall back upon the sample and use their estimates so as to obtain an estimate of $\Sigma_u$. We propose to use a plug-in estimate, namely,

$$ \hat{\Sigma}_u = \left[\hat{\Sigma} + X_{n+1}^{(u)} \left\{\sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \hat{\sigma}^{ij} x_{ik}^{(u)} x_{jk}^{(u)\prime}\right\}^{-1} X_{n+1}^{(u)\prime}\right], \tag{4.3.2} $$

where, because of the assumption C+2(i), the estimate $\hat{\Sigma}$ is obtained by pooling the estimates obtained through the m rival models, and $\nu$ is the number of degrees of freedom associated with the pooled estimate $\hat{\Sigma}$.

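It may be noted that because of this new arrangement, $Y_{n+1}$ no longer obeys a normal probability law. As in the univariate case, we replace the r-variate normal distribution of $Y_{n+1}$ by an r-variate t distribution, $t_r(\hat{y}^{(u)}_{n+1}, \hat{\Sigma}_u, \nu)$, where $\hat{\Sigma}_u$ is an estimate of $\Sigma_u$ given by (4.3.2) and $\nu$ is the number of degrees of freedom associated with this estimate. Thus, given n sets of observations, the p.d.f. of $Y_{n+1}$, under model u, can be specified as

$$ f^{(u)}_{n+1}(y_{n+1} \mid y) = \left[\frac{\Gamma(\frac{\nu+r}{2})}{\{\Gamma(\frac{1}{2})\}^r\, \Gamma(\frac{\nu}{2})\, \nu^{r/2}}\right] |\hat{\Sigma}_u|^{-1/2} \left[1 + \nu^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1})' \hat{\Sigma}_u^{-1} (y_{n+1} - \hat{y}^{(u)}_{n+1})\right]^{-(\nu+r)/2}. \tag{4.3.4} $$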

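Similarly, given n sets of observations, the p.d.f. of $Y_{n+1}$, under model v, can be written as

$$ f^{(v)}_{n+1}(y_{n+1} \mid y) = \left[\frac{\Gamma(\frac{\nu+r}{2})}{\{\Gamma(\frac{1}{2})\}^r\, \Gamma(\frac{\nu}{2})\, \nu^{r/2}}\right] |\hat{\Sigma}_v|^{-1/2} \left[1 + \nu^{-1} (y_{n+1} - \hat{y}^{(v)}_{n+1})' \hat{\Sigma}_v^{-1} (y_{n+1} - \hat{y}^{(v)}_{n+1})\right]^{-(\nu+r)/2}, \tag{4.3.5} $$

where

$$ \hat{\Sigma}_v = \left[\hat{\Sigma} + X_{n+1}^{(v)} \left\{\sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \hat{\sigma}^{ij} x_{ik}^{(v)} x_{jk}^{(v)\prime}\right\}^{-1} X_{n+1}^{(v)\prime}\right], \tag{4.3.6} $$

with $\nu$ degrees of freedom.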

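Having decided on the alternative probability distributions of $Y_{n+1}$ under models u and v with the restriction that $\Sigma$, the common covariance matrix, is unknown, we now seek the divergence

$$ h_2(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = \int_{R^r} [f^{(u)}_{n+1}(y_{n+1} \mid y)\, f^{(v)}_{n+1}(y_{n+1} \mid y)]^{1/2}\, dy_{n+1}. \tag{4.3.7} $$

Substituting for $f^{(u)}_{n+1}$ and $f^{(v)}_{n+1}$ in (4.3.7) from (4.3.4) and (4.3.5) we get

$$ h_2(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C_1 [|\hat{\Sigma}_u| |\hat{\Sigma}_v|]^{-1/4} \int_{R^r} Q_3(y_{n+1})\, dy_{n+1}, \tag{4.3.8} $$

where

$$ C_1 = \left[\frac{\Gamma(\frac{\nu+r}{2})}{\{\Gamma(\frac{1}{2})\}^r\, \Gamma(\frac{\nu}{2})\, \nu^{r/2}}\right], \tag{4.3.9} $$

$$ Q_3(y_{n+1}) = [1 + T_u \nu^{-1}]^{-(\nu+r)/4}\, [1 + T_v \nu^{-1}]^{-(\nu+r)/4}, \tag{4.3.10} $$

and

$$ T_i = (y_{n+1} - \hat{y}^{(i)}_{n+1})' \hat{\Sigma}_i^{-1} (y_{n+1} - \hat{y}^{(i)}_{n+1}), \quad i = u, v. \tag{4.3.11} $$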


The integrand $Q_3$ in (4.3.8) can be simplified to a tractable form. For that purpose we write

$$ [1 + T_u \nu^{-1}]^{-(\nu+r)/4} = \exp\left\{-\frac{(\nu+r)}{4} \log(1 + T_u \nu^{-1})\right\}. \tag{4.3.12} $$

Now, if we use the expansion of $\log(1 + T_u \nu^{-1})$ in powers of $T_u \nu^{-1}$ on the right hand side of (4.3.12) and arrange the terms in powers of $\nu^{-1}$, we can write

$$ [1 + T_u \nu^{-1}]^{-(\nu+r)/4} = \exp(-\tfrac{1}{4} T_u) \exp\left(\sum_{i=1}^{\infty} \tau_i \nu^{-i}\right), \tag{4.3.13} $$

where

$$ \tau_i = \frac{(-1)^i}{4i(i+1)} \{(i+1)\, r\, T_u^{i} - i\, T_u^{i+1}\}. \tag{4.3.14} $$

(Refer to D.6 of Appendix D for details of the algebra used.)
Further, the second factor on the right hand side of (4.3.13) can be written as

$$ \exp\left(\sum_{i=1}^{\infty} \tau_i \nu^{-i}\right) = \prod_{i=1}^{\infty} \exp(\tau_i \nu^{-i}), $$

so that on using the expansions of the factors $\exp(\tau_i \nu^{-i})$ in the infinite product and collecting coefficients of like powers of $\nu^{-1}$ we obtain

$$ \exp\left(\sum_{i=1}^{\infty} \tau_i \nu^{-i}\right) = \sum_{i=0}^{\infty} a_i \nu^{-i}, \tag{4.3.15} $$

where

$a_0 = 1$,

$a_1 = \tau_1$,

$a_2 = \frac{\tau_1^2}{2} + \tau_2$,

$a_3 = \frac{\tau_1^3}{6} + \tau_1\tau_2 + \tau_3$,

$a_4 = \frac{\tau_1^4}{24} + \frac{\tau_1^2\tau_2}{2} + \frac{\tau_2^2}{2} + \tau_1\tau_3 + \tau_4$,

$a_5 = \frac{\tau_1^5}{120} + \frac{\tau_1^3\tau_2}{6} + \frac{\tau_1\tau_2^2}{2} + \frac{\tau_1^2\tau_3}{2} + \tau_1\tau_4 + \tau_2\tau_3 + \tau_5$,

$a_6 = \frac{\tau_1^6}{720} + \frac{\tau_1^4\tau_2}{24} + \frac{\tau_1^2\tau_2^2}{4} + \frac{\tau_1^3\tau_3}{6} + \frac{\tau_2^3}{6} + \tau_1\tau_2\tau_3 + \frac{\tau_3^2}{2} + \frac{\tau_1^2\tau_4}{2} + \tau_1\tau_5 + \tau_2\tau_4 + \tau_6$,

etc.
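The coefficients $a_i$ need not be written out by hand. Differentiating both sides of (4.3.15) with respect to $\nu^{-1}$ gives the recurrence $i\,a_i = \sum_{k=1}^{i} k\,\tau_k\,a_{i-k}$, which is our own shortcut and not a step taken from Appendix D. The sketch below uses it with exact fractions, checks it against the closed form of $a_4$ listed above, and compares a five-term partial sum with the left hand side of (4.3.13); the values of $T_u$, r and $\nu$ are hypothetical.

```python
import math
from fractions import Fraction

def tau_values(T, r, order):
    """tau_1..tau_order from (4.3.14) for numeric T and r (index 0 is unused)."""
    return [Fraction(0)] + [
        Fraction((-1) ** i, 4 * i * (i + 1)) * ((i + 1) * r * T ** i - i * T ** (i + 1))
        for i in range(1, order + 1)]

def series_coefficients(tau, order):
    """a_0..a_order such that exp(sum_i tau_i x^i) = sum_i a_i x^i."""
    a = [Fraction(1)]
    for i in range(1, order + 1):
        a.append(sum(k * tau[k] * a[i - k] for k in range(1, i + 1)) / i)
    return a

T, r, nu = Fraction(3, 2), 2, 50  # hypothetical quadratic form, responses, d.f.
tau = tau_values(T, r, 4)
a = series_coefficients(tau, 4)

# Closed form of a_4 as listed in the text.
a4_closed = (tau[1] ** 4 / 24 + tau[1] ** 2 * tau[2] / 2 + tau[2] ** 2 / 2
             + tau[1] * tau[3] + tau[4])

# Partial sum of the series against the exact left hand side of (4.3.13).
lhs = (1 + float(T) / nu) ** (-(nu + r) / 4)
rhs = math.exp(-float(T) / 4) * sum(float(a[i]) / nu ** i for i in range(5))
print(a[4] == a4_closed, lhs, rhs)
```

The same recurrence extends to any order, which is convenient if more terms of the series are wanted than are listed here.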


In order to express the series on the right hand side of (4.3.15) in terms of $T_u$ we use the relation (4.3.14) and get

$$ \exp\left(\sum_{i=1}^{\infty} \tau_i \nu^{-i}\right) = \sum_{i=0}^{\infty} A_i(T_u)\, \nu^{-i}, \tag{4.3.16} $$

where

$$ A_i(T_u) = \sum_{k=1}^{i+1} \sum_{s=1}^{k} \alpha_{iks}\, r^{k-s}\, T_u^{2i-k+1}, \tag{4.3.17} $$

and the $\alpha_{iks}$'s in (4.3.17) are given in Table 4.3.1. Hence from (4.3.13) we get

$$ [1 + T_u \nu^{-1}]^{-(\nu+r)/4} = \exp(-\tfrac{1}{4} T_u) \sum_{i=0}^{\infty} A_i(T_u)\, \nu^{-i}. \tag{4.3.18} $$

Similarly, the second factor of the integrand $Q_3$ in (4.3.8) can be expressed as



Table 4.3.1

Values of the coefficients $\alpha_{iks}$ and $\beta_{j\ell t}$

[The numerical entries of this table are not legible in the source.]

E-k stands for $\times 10^{-k}$.



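$$ [1 + T_v \nu^{-1}]^{-(\nu+r)/4} = \exp(-\tfrac{1}{4} T_v) \sum_{j=0}^{\infty} B_j(T_v)\, \nu^{-j}, \tag{4.3.19} $$

where

$$ B_j(T_v) = \sum_{\ell=1}^{j+1} \sum_{t=1}^{\ell} \beta_{j\ell t}\, r^{\ell-t}\, T_v^{2j-\ell+1}, \tag{4.3.20} $$

and the $\beta_{j\ell t}$'s are given in Table 4.3.1.

As a result the integrand in (4.3.8) becomes

$$ Q_3(y_{n+1}) = \exp\{-\tfrac{1}{4}(T_u + T_v)\} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} A_{ij}(y_{n+1})\, \nu^{-(i+j)}, \tag{4.3.21} $$

where

$$ A_{ij}(y_{n+1}) = A_i(T_u)\, B_j(T_v). \tag{4.3.22} $$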


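Further, using (A.1.1) through (A.1.3) of Lemma A.1 (Appendix A) we can write

$$ T_u + T_v = (y_{n+1} - y^*_{n+1})' (\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1}) (y_{n+1} - y^*_{n+1}) + (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{\Sigma}_u + \hat{\Sigma}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1}), \tag{4.3.23} $$

where

$$ y^*_{n+1} = (\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1})^{-1} (\hat{\Sigma}_u^{-1} \hat{y}^{(u)}_{n+1} + \hat{\Sigma}_v^{-1} \hat{y}^{(v)}_{n+1}). \tag{4.3.24} $$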


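Thus, finally, substituting for $(T_u + T_v)$ from (4.3.23) in (4.3.21) and going back to (4.3.8) we get

$$ h_2(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C_1 \left[\exp\{-\tfrac{1}{4} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{\Sigma}_u + \hat{\Sigma}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\}\right] [|\hat{\Sigma}_u| |\hat{\Sigma}_v|]^{-1/4} \times \int_{R^r} \left[\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} A_{ij}(y_{n+1})\, \nu^{-(i+j)} \exp\{-\tfrac{1}{4} (y_{n+1} - y^*_{n+1})' (\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1}) (y_{n+1} - y^*_{n+1})\}\right] dy_{n+1}. \tag{4.3.25} $$

This suggests that the integral on the right hand side of (4.3.25) can be evaluated term by term and that each integral involves a bivariate polynomial in the quadratic forms $T_u$ and $T_v$. Let $Y_{n+1}$ be distributed as $N_r(y^*_{n+1}, \{(\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1})/2\}^{-1})$; then

$$ h_2(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C_1 (2\pi)^{r/2} \left[|\hat{\Sigma}_u \hat{\Sigma}_v|^{-1/4}\, |(\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1})/2|^{-1/2}\right] \times \exp\{-\tfrac{1}{4} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{\Sigma}_u + \hat{\Sigma}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \delta_{ij}\, \nu^{-(i+j)}, \tag{4.3.26} $$

where

$$ \delta_{ij} = E(A_{ij}(Y_{n+1})) = \sum_{k=1}^{i+1} \sum_{\ell=1}^{j+1} \sum_{s=1}^{k} \sum_{t=1}^{\ell} \alpha_{iks} \beta_{j\ell t}\, \mu_{2i-k+1,\,2j-\ell+1}\, r^{(k+\ell-s-t)}, \tag{4.3.27} $$

with $\mu_{a,b} = E(T_u^a T_v^b)$.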

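Since $\hat{\Sigma}_u^{-1}$ and $\hat{\Sigma}_v^{-1}$ exist, we can use the identity

$$ \hat{\Sigma}_u (\hat{\Sigma}_u + \hat{\Sigma}_v)^{-1} \hat{\Sigma}_v = (\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1})^{-1} $$

and express $h_2(f^{(u)}_{n+1}, f^{(v)}_{n+1})$ of equation (4.3.26) as

$$ h_2(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C_1^* \left[\frac{|\hat{\Sigma}_u| |\hat{\Sigma}_v|}{|(\hat{\Sigma}_u + \hat{\Sigma}_v)/2|^2}\right]^{1/4} \exp\left[-\tfrac{1}{4}\{(\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{\Sigma}_u + \hat{\Sigma}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\}\right], \tag{4.3.28} $$

with

$$ C_1^* = C_1 (2\pi)^{r/2} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \delta_{ij}\, \nu^{-(i+j)}. \tag{4.3.29} $$

The mixed moments $\mu_{a,b}$ involved in (4.3.27) can be obtained indirectly through the joint cumulants of the quadratic forms $T_u$ and $T_v$, by means of the inversion formulae of Cook (1951) (refer to D.5 of Appendix D).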

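So far as the evaluation of the mixed cumulants, $\kappa_{a,b}$ ($a = 2i-k+1$, $b = 2j-\ell+1$), is concerned, we use the fact from (4.3.25) that $Y_{n+1}$ can be considered to be distributed according to $N_r(y^*_{n+1}, \bar{\Sigma})$, with $\bar{\Sigma}^{-1} = (\hat{\Sigma}_u^{-1} + \hat{\Sigma}_v^{-1})/2$, so that from equation (C.2.12) of Lemma C.1 (Appendix C) we have the (a,b)th cumulant of $T_u$ and $T_v$ given by

$$ \kappa_{a,b} = 2^{a+b-1} (a+b-2)!\, \{(a+b-1)\, \mathrm{tr}(R_u^a R_v^b) + (a\delta_u + b\delta_v)' \bar{\Sigma}^{-1} R_u^a R_v^b (a\delta_u + b\delta_v) - a\, \delta_u' \bar{\Sigma}^{-1} R_u^a R_v^b\, \delta_u - b\, \delta_v' \bar{\Sigma}^{-1} R_u^a R_v^b\, \delta_v\}, \tag{4.3.30} $$

where $\delta_i = (y^*_{n+1} - \hat{y}^{(i)}_{n+1})$ and $R_i = \hat{\Sigma}_i^{-1} \bar{\Sigma}$, $i = u, v$.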


In connection with the conversion of $\kappa_{a,b}$ to $\mu_{a,b}$, it may be noted that the formulae proposed to be used for this purpose are given only for the orders (a,b) such that a+b < 6. However, $\nu^{-(i+j)}$ and the constants $\alpha_{iks}$ and $\beta_{j\ell t}$ decrease so rapidly with the expansion that, by the time (a+b) exceeds 6, the low magnitude of the factor $\alpha_{iks}\beta_{j\ell t}\,\nu^{-(i+j)}$ would make the contribution from $\delta_{ij}$ negligible. This shows that the inversion formulae are sufficient in number for our purpose.

Finally, we point out certain important features of the function $h_2$ as given in equation (4.3.28). It may be noticed that, except for the factor $C_1^*$, $h_2$, like its univariate analogue, is of the same form as in the case of known $\Sigma$. Besides, even when the common covariance matrix, $\Sigma$, is replaced by its sample estimate, the criterion function still takes into account the dissimilarity of the alternative probability distributions of $Y_{n+1}$ with respect to location and orientation which is likely to be introduced due to the (n+1)th observation.



4.4 CASE 3: UNKNOWN, UNEQUAL ERROR COVARIANCE MATRICES

There is yet another situation in which it might be reasonable to assume that the errors considered in rival models have different covariance structures. To the list of basic assumptions B+(i) through B+(v), we therefore append the following assumptions:

C+3(i) $\Sigma_u \neq \Sigma_v$, $u, v = 1, 2, \ldots, m$ ($u \neq v$),

C+3(ii) $\Sigma_u$ is not known, $u = 1, 2, \ldots, m$.

Consider a random vector $Y_{n+1}$ on which an observation is due to be taken. If, from the list of assumptions, we suppress C+3(i) and C+3(ii) for the time being, then we have a situation of the type discussed in Section 4.2. Now, it has already been established in that section that if $\Sigma_u$ is known then, under model u, the response vector $Y_{n+1}$ would be distributed according to the normal distribution $N_r(\hat{y}^{(u)}_{n+1}, W_u)$, where

$$ W_u = \left[\Sigma_u + X_{n+1}^{(u)} \left\{\sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \sigma_u^{ij} x_{ik}^{(u)} x_{jk}^{(u)\prime}\right\}^{-1} X_{n+1}^{(u)\prime}\right], \tag{4.4.1} $$

with $\sigma_u^{ij}$ as the (i,j)th element of $\Sigma_u^{-1}$ and $x_{ik}^{(u)}$, $X_{n+1}^{(u)}$ as given earlier in (4.2.5) and (4.2.8), respectively.


But, for the present discussion, $\Sigma_u$ has been assumed to be unknown, a situation similar to the one discussed in Section 4.3. In this case we ought to exploit the sample of observations for the estimation of the covariance matrix $\Sigma_u$, so that if $\hat{\Sigma}_u$ denotes an estimate of $\Sigma_u$, then $W_u$ may be estimated by the matrix

$$ \hat{W}_u = \left[\hat{\Sigma}_u + X_{n+1}^{(u)} \left\{\sum_{k=1}^{n} \sum_{i=1}^{r} \sum_{j=1}^{r} \hat{\sigma}_u^{ij} x_{ik}^{(u)} x_{jk}^{(u)\prime}\right\}^{-1} X_{n+1}^{(u)\prime}\right], \tag{4.4.2} $$

where $\hat{\sigma}_u^{ij}$ denotes the (i,j)th element of $\hat{\Sigma}_u^{-1}$. Nevertheless, it may be borne in mind that the response vector, $Y_{n+1}$, would no longer obey a normal probability law. The theory of Section 4.2 is, therefore, not applicable in the situation thus arisen. In this case, as in Section 4.3, we use an r-variate t distribution, $t_r(\hat{y}^{(u)}_{n+1}, \hat{W}_u, \nu_u)$, for $Y_{n+1}$, where $\nu_u$ stands for the degrees of freedom associated with the estimate $\hat{W}_u$.


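Consider now two rival models, u and v (say). The p.d.f. of $Y_{n+1}$ under these models is given by

$$ f^{(i)}_{n+1}(y_{n+1} \mid y) = \left[\frac{\Gamma(\frac{\nu_i+r}{2})}{\{\Gamma(\frac{1}{2})\}^r\, \Gamma(\frac{\nu_i}{2})\, \nu_i^{r/2}}\right] |\hat{W}_i|^{-1/2} \left[1 + \nu_i^{-1} (y_{n+1} - \hat{y}^{(i)}_{n+1})' \hat{W}_i^{-1} (y_{n+1} - \hat{y}^{(i)}_{n+1})\right]^{-(\nu_i+r)/2}, \quad i = u, v. \tag{4.4.3} $$

Substituting for $f^{(u)}_{n+1}$ and $f^{(v)}_{n+1}$ from (4.4.3) in the function

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = \int_{R^r} [f^{(u)}_{n+1}(y_{n+1} \mid y)\, f^{(v)}_{n+1}(y_{n+1} \mid y)]^{1/2}\, dy_{n+1} $$

we get

$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C_2 [|\hat{W}_u| |\hat{W}_v|]^{-1/4} \int_{R^r} Q_4(y_{n+1})\, dy_{n+1}, \tag{4.4.4} $$

where

$$ C_2 = \left[\frac{\Gamma(\frac{\nu_u+r}{2})\, \Gamma(\frac{\nu_v+r}{2})}{\{\Gamma(\frac{1}{2})\}^{2r}\, \Gamma(\frac{\nu_u}{2})\, \Gamma(\frac{\nu_v}{2})\, (\nu_u \nu_v)^{r/2}}\right]^{1/2}, \tag{4.4.5} $$

$$ Q_4(y_{n+1}) = [1 + T_u \nu_u^{-1}]^{-(\nu_u+r)/4}\, [1 + T_v \nu_v^{-1}]^{-(\nu_v+r)/4}, \tag{4.4.6} $$

and

$$ T_i = (y_{n+1} - \hat{y}^{(i)}_{n+1})' \hat{W}_i^{-1} (y_{n+1} - \hat{y}^{(i)}_{n+1}), \quad i = u, v. \tag{4.4.7} $$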

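We notice that the integrand, $Q_4(y_{n+1})$, in (4.4.4) is of the same form as $Q_3(y_{n+1})$ in (4.3.10) (refer to Section 4.3). Therefore, in order to simplify the integrand to a tractable form we can use the same algebra, namely, that from equations (4.3.12) through (4.3.18) of Section 4.3. This reduces the two factors of $Q_4$ to the forms [refer to equations (4.3.18) and (4.3.19) in Section 4.3]

$$ [1 + T_u \nu_u^{-1}]^{-(\nu_u+r)/4} = \exp(-\tfrac{1}{4} T_u) \sum_{i=0}^{\infty} A_i(T_u)\, \nu_u^{-i} \tag{4.4.8} $$

and

$$ [1 + T_v \nu_v^{-1}]^{-(\nu_v+r)/4} = \exp(-\tfrac{1}{4} T_v) \sum_{j=0}^{\infty} B_j(T_v)\, \nu_v^{-j}, \tag{4.4.9} $$

where

$$ A_i(T_u) = \sum_{k=1}^{i+1} \sum_{s=1}^{k} \alpha_{iks}\, r^{k-s}\, T_u^{2i-k+1}, \tag{4.4.10} $$

$$ B_j(T_v) = \sum_{\ell=1}^{j+1} \sum_{t=1}^{\ell} \beta_{j\ell t}\, r^{\ell-t}\, T_v^{2j-\ell+1}, \tag{4.4.11} $$

with the $\alpha_{iks}$'s and $\beta_{j\ell t}$'s given in Table 4.3.1.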

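Consequently, the integrand $Q_4$ assumes the form

$$ Q_4(y_{n+1}) = \exp\{-\tfrac{1}{4}(T_u + T_v)\} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} A_{ij}(y_{n+1})\, \nu_u^{-i} \nu_v^{-j}, \tag{4.4.12} $$

where

$$ A_{ij}(y_{n+1}) = A_i(T_u)\, B_j(T_v). $$

Further, using (A.1.1) and (A.1.2) of Lemma A.1 (Appendix A), we have

$$ T_u + T_v = (y_{n+1} - y^*_{n+1})' (\hat{W}_u^{-1} + \hat{W}_v^{-1}) (y_{n+1} - y^*_{n+1}) + (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{W}_u + \hat{W}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1}), $$

where

$$ y^*_{n+1} = (\hat{W}_u^{-1} + \hat{W}_v^{-1})^{-1} (\hat{W}_u^{-1} \hat{y}^{(u)}_{n+1} + \hat{W}_v^{-1} \hat{y}^{(v)}_{n+1}). $$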


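So that, finally, from (4.4.12) we obtain

$$ Q_4(y_{n+1}) = \left[\exp\{-\tfrac{1}{4} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{W}_u + \hat{W}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\}\right] \times \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} A_{ij}(y_{n+1})\, \nu_u^{-i} \nu_v^{-j} \exp\{-\tfrac{1}{4} (y_{n+1} - y^*_{n+1})' (\hat{W}_u^{-1} + \hat{W}_v^{-1}) (y_{n+1} - y^*_{n+1})\}. \tag{4.4.13} $$

Having expressed the integrand in this form, it is easy now to evaluate the measure of similarity, $h_3$. In fact, if we now substitute for $Q_4(y_{n+1})$ in (4.4.4) from (4.4.13) we get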



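$$ h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1}) = C_2 \left[\exp\{-\tfrac{1}{4} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})' (\hat{W}_u + \hat{W}_v)^{-1} (\hat{y}^{(u)}_{n+1} - \hat{y}^{(v)}_{n+1})\}\right] [|\hat{W}_u| |\hat{W}_v|]^{-1/4} \times \int_{R^r} \left[\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} A_{ij}(y_{n+1})\, \nu_u^{-i} \nu_v^{-j} \exp\{-\tfrac{1}{4} (y_{n+1} - y^*_{n+1})' (\hat{W}_u^{-1} + \hat{W}_v^{-1}) (y_{n+1} - y^*_{n+1})\}\right] dy_{n+1}. \tag{4.4.14} $$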

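A look at the right hand side of (4.4.14) suggests some useful
hints for the evaluation of $h_3(f^{(u)}_{n+1}, f^{(v)}_{n+1})$. Firstly, it suggests that
the integral involved therein can be evaluated term by term.
Secondly, $Y_{n+1}$ may be considered to be normally distributed,
$N_r(\bar y_{n+1}, V)$, with $V^{-1} = (W_u^{-1} + W_v^{-1})/2$. This makes it possible to
think of each integral as an expected value of a bivariate
polynomial in the quadratic forms $T_u$ and $T_v$, as one observes
from the expressions (4.4.10) and (4.4.11) for $A_i$ and $B_j$,
respectively. We thus have

\[
h_3\big(f^{(u)}_{n+1}, f^{(v)}_{n+1}\big) = C_2\,(2\pi)^{r/2}\, |W_u W_v|^{-1/4}\, \big|(W_u^{-1} + W_v^{-1})/2\big|^{-1/2}
\exp\{-\tfrac14 \big(\hat y^{(u)}_{n+1} - \hat y^{(v)}_{n+1}\big)'(W_u + W_v)^{-1}\big(\hat y^{(u)}_{n+1} - \hat y^{(v)}_{n+1}\big)\}
\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \delta_{ij}\, \nu_u^{-i}\nu_v^{-j},
\tag{4.4.15}
\]

where

\[
\delta_{ij} = E\big(A_{ij}(y_{n+1})\big)
= \sum_{k=1}^{i+1}\sum_{\ell=1}^{j+1}\sum_{s=1}^{k}\sum_{t=1}^{\ell} \alpha_{iks}\,\beta_{j\ell t}\, r^{(k+\ell-s-t)}\, \mu_{2i-k+1,\,2j-\ell+1}
\]

and $\mu_{a,b} = E(T_u^a T_v^b)$.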

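Since $W_u^{-1}$ and $W_v^{-1}$ exist, we can use the identity

\[
W_u (W_u + W_v)^{-1} W_v = (W_u^{-1} + W_v^{-1})^{-1}
\]

and, finally, write the function $h_3$ as

\[
h_3\big(f^{(u)}_{n+1}, f^{(v)}_{n+1}\big) = C_3 \left[ \frac{|W_u|\,|W_v|}{\big|(W_u + W_v)/2\big|^{2}} \right]^{1/4}
\exp\big[-\tfrac14 \big(\hat y^{(u)}_{n+1} - \hat y^{(v)}_{n+1}\big)'(W_u + W_v)^{-1}\big(\hat y^{(u)}_{n+1} - \hat y^{(v)}_{n+1}\big)\big],
\tag{4.4.16}
\]

where

\[
C_3 = \left[ \frac{2^{r}\,\Gamma\!\left(\frac{\nu_u+r}{2}\right)\Gamma\!\left(\frac{\nu_v+r}{2}\right)}
{\Gamma\!\left(\frac{\nu_u}{2}\right)\Gamma\!\left(\frac{\nu_v}{2}\right)(\nu_u\nu_v)^{r/2}} \right]^{1/2}
\sum_{i=0}^{\infty}\sum_{j=0}^{\infty} \delta_{ij}\, \nu_u^{-i}\nu_v^{-j}
\tag{4.4.17}
\]

and

\[
\delta_{ij} = \sum_{k=1}^{i+1}\sum_{\ell=1}^{j+1}\sum_{s=1}^{k}\sum_{t=1}^{\ell} \alpha_{iks}\,\beta_{j\ell t}\, r^{(k+\ell-s-t)}\, \mu_{2i-k+1,\,2j-\ell+1}.
\tag{4.4.18}
\]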

For the evaluation of the mixed moments, $\mu_{a,b}$ ($a = 2i-k+1$,
$b = 2j-\ell+1$), we resort to the inversion formulae of
Cook (1951), which express the joint moments in terms of the
joint cumulants. Now, it has already been observed that $Y_{n+1}$
may be considered to have a normal distribution $N_r(\bar y_{n+1}, V)$.
This fact, when used in Lemma C.2 of Appendix C, gives the
required $(a,b)$th cumulant $\kappa_{a,b}$ as

k a b = 2 a+b_1 (a-rt»2)l { (a+b-l) tr (R a R b + 

a f° U V 


< a Su * bp' w R a R b (a^ + tip - 

SS I Si Cf a ~b S3 1 r^p, rJVy *£ 

a ^u W R U R v flu - b «v W R u R v flv)> 


(4.4.19) 


where Si = (y^* - y^), R ± = IT 1 V , i = u.v. 


It may be noted that the remark made on the adequacy of
inversion formulae in Section 4.3 is applicable in the present
case as well. Besides, the function $h_3$ given by equations
(4.4.16) through (4.4.18) has the same features as its
univariate analogue in Section 3.4 of Chapter 3.
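As an illustration of the structure of (4.4.16), the following is a minimal sketch that evaluates only the determinant ratio and the exponential factor of $h_3$, that is, $[\,|W_u||W_v| / |(W_u+W_v)/2|^2\,]^{1/4} \exp[-\tfrac14 d'(W_u+W_v)^{-1} d]$ with $d$ the difference of the two predicted response vectors. The constant $C_3$ (the series in $\delta_{ij}$) is deliberately left out, and the function name and argument names are hypothetical, not taken from the thesis programs.

```python
import numpy as np


def leading_similarity_factor(y_u, y_v, W_u, W_v):
    """Determinant and exponential factors of h_3, without the constant C_3.

    y_u, y_v : predicted response vectors under models u and v (length r)
    W_u, W_v : the corresponding r x r positive definite scale matrices
    """
    y_u, y_v = np.asarray(y_u, dtype=float), np.asarray(y_v, dtype=float)
    W_u, W_v = np.asarray(W_u, dtype=float), np.asarray(W_v, dtype=float)
    diff = y_u - y_v
    W_sum = W_u + W_v
    # Work with log-determinants to avoid overflow or underflow in |W|.
    logdet_u = np.linalg.slogdet(W_u)[1]
    logdet_v = np.linalg.slogdet(W_v)[1]
    logdet_mean = np.linalg.slogdet(0.5 * W_sum)[1]
    log_det_factor = 0.25 * (logdet_u + logdet_v - 2.0 * logdet_mean)
    # Quadratic form d' (W_u + W_v)^{-1} d, computed without explicit inversion.
    quad = diff @ np.linalg.solve(W_sum, diff)
    return float(np.exp(log_det_factor - 0.25 * quad))
```

When the two models give identical predictions and identical scale matrices the factor equals one, and it decreases as the predictions move apart, which is the behaviour one expects of a measure of similarity.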




CHAPTER 5 


SOME THEORY OF ESTIMATION 


5.1 ESTIMATION OF MODEL PARAMETERS 

A reference to the design and discrimination criteria
developed in Chapters 3 and 4 brings out the important point
that the physical parameters involved in the response function(s)
must be estimated in order to assess the level of
discrimination among the rival models and to design more
experiments for the purpose, if need be. In what follows
we, therefore, search for appropriate objective
functions which may be employed for the estimation of the model
parameters, $\theta^{(u)}$ ($u = 1, 2, \ldots, m$), in various types of
situations, depending mainly on our knowledge of the covariance
structure of the error(s).

With n data points at hand, the $p_u$ parameters of model u
(say) can be estimated efficiently if a suitable method of
estimation is employed. A wide choice of such techniques is
available for this purpose. However, since the procedure developed
here is based on maximum likelihood estimates and the
observations are normally distributed, the estimation criterion
is derived from the likelihood function
for different types of situations. Of the several forms of the
criterion function, an appropriate one is selected on the basis
of the assumptions being used in a given situation.

Consider an r-response model ($r \ge 1$), $M^{(u)}$ (say). Since
the errors in each experiment have been assumed to follow a
normal distribution, $N_r(0, \Sigma_u)$, and the errors in different
experiments have been considered to be independent, the
likelihood of the n sets of data, under model u, can be written
as

\[
L(\theta^{(u)}) = \{(2\pi)^r\, |\Sigma_u|\}^{-n/2} \exp\big[-\tfrac12\, \mathrm{tr}\{\Sigma_u^{-1} R(\theta^{(u)})\}\big],
\tag{5.1.1}
\]

where

\[
R(\theta^{(u)}) = \sum_{k=1}^{n} e_k^{(u)} e_k^{(u)\prime},
\tag{5.1.2}
\]

with

\[
e_k^{(u)} = y_k - \eta^{(u)}(\xi_k, \theta^{(u)}).
\tag{5.1.3}
\]

In the sequel, we shall use $R^{(u)}$ to denote $R(\theta^{(u)})$.

We now recall that in the design and discrimination criteria
we advocated the use of the m.l.e. of $\theta^{(u)}$. Therefore, we
seek such values, $\hat\theta^{(u)}$, of $\theta^{(u)}$ which would maximize the
likelihood function $L(\theta^{(u)})$. To that purpose we rather maximize
the simpler function $\log_e L(\theta^{(u)})$, i.e., the function

\[
\ell(\theta^{(u)}) = -\frac{nr}{2}\log_e(2\pi) - \frac{n}{2}\log_e|\Sigma_u| - \frac12\, \mathrm{tr}\big(\Sigma_u^{-1} R^{(u)}\big).
\tag{5.1.4}
\]

Situation 1: If $\Sigma_u$ is known, maximizing $\ell(\theta^{(u)})$ is equivalent
to minimizing the function

\[
\phi_1(\theta^{(u)}) = \mathrm{tr}\big(\Sigma_u^{-1} R^{(u)}\big),
\tag{5.1.5}
\]

which is, therefore, the criterion for parameter estimation
yielding the m.l.e. of $\theta^{(u)}$.

Situation 2: When the responses in each experiment are known
to be independent and measured with the same precision, i.e.,
$\Sigma_u = \sigma_u^2 I$, then, whether or not $\sigma_u^2$ is known, the m.l.e.'s of the model
parameters can be obtained through the minimization of

\[
\phi_2(\theta^{(u)}) = \sum_{k=1}^{n} e_k^{(u)\prime} e_k^{(u)} = \mathrm{tr}\big[R(\theta^{(u)})\big].
\tag{5.1.6}
\]

Situation 3: In yet another situation the covariance matrix
$\Sigma_u$ may be unknown. However, if this matrix is known to be a
diagonal matrix (meaning that the responses are independent,
but measured with different precisions), the criterion for
parameter estimation may be obtained by stagewise maximization
of $\ell(\theta^{(u)})$, now given by

\[
\ell(\theta^{(u)}) = -\frac{nr}{2}\log_e(2\pi) - \frac{n}{2}\sum_{i=1}^{r}\log_e \sigma_{u,ii} - \frac12 \sum_{i=1}^{r} \frac{R^{(u)}_{ii}}{\sigma_{u,ii}},
\tag{5.1.7}
\]

where

\[
R^{(u)}_{ii} = \sum_{k=1}^{n} \big[y_{ik} - \eta_i^{(u)}(\xi_k, \theta^{(u)})\big]^2.
\tag{5.1.8}
\]


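At the first stage we seek the solution of the equation

\[
\frac{\partial \ell(\theta^{(u)})}{\partial \sigma_{u,ii}} = -\frac{n}{2\sigma_{u,ii}} + \frac{R^{(u)}_{ii}}{2\sigma_{u,ii}^{2}} = 0.
\]

In fact, this equation has a unique solution, namely,

\[
\tilde\sigma_{u,ii} = \frac{R^{(u)}_{ii}}{n}.
\tag{5.1.9}
\]

This expression, when used in place of $\sigma_{u,ii}$ in equation (5.1.7),
reduces $\ell(\theta^{(u)})$ to

\[
\ell(\theta^{(u)}) = -\frac{nr}{2}\{\log_e(2\pi) + 1\} + \frac{nr}{2}\log_e n - \frac{n}{2}\sum_{i=1}^{r}\log_e R^{(u)}_{ii}.
\tag{5.1.10}
\]

Thus the function $\ell(\theta^{(u)})$ can be maximized with respect to $\theta^{(u)}$
by minimizing

\[
\phi_3(\theta^{(u)}) = \sum_{i=1}^{r} \log_e R^{(u)}_{ii}.
\tag{5.1.11}
\]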

The case of correlated responses, which are measured with
different accuracies, can be dealt with similarly. As a
first step, we maximize $\ell(\theta^{(u)})$ of equation (5.1.4) with respect
to the covariance matrix $\Sigma_u$. This amounts to solving the matrix
equation

\[
\frac{\partial \ell(\theta^{(u)})}{\partial \Sigma_u} = -\frac{n}{2}\Sigma_u^{-1} + \frac12\, \Sigma_u^{-1} R^{(u)} \Sigma_u^{-1} = 0_{r \times r},
\tag{5.1.12}
\]

where the $r \times r$ matrix 0 denotes a matrix with all zero entries.
This equation can be rewritten as

\[
\Sigma_u^{-1} = \frac{1}{n}\, \Sigma_u^{-1} R^{(u)} \Sigma_u^{-1},
\]

so that on premultiplying and postmultiplying both sides by $\Sigma_u$,
we get

\[
\tilde\Sigma_u = \frac{1}{n} R^{(u)}.
\tag{5.1.13}
\]

As a result, $\ell(\theta^{(u)})$ assumes the form

\[
\ell(\theta^{(u)}) = -\frac{nr}{2}\log_e(2\pi) - \frac{n}{2}\log_e\big|R^{(u)}/n\big| - \frac12\, \mathrm{tr}\big\{(R^{(u)}/n)^{-1} R^{(u)}\big\}
= -\frac{nr}{2}\{\log_e(2\pi) + 1\} + \frac{nr}{2}\log_e n - \frac{n}{2}\log_e\big|R^{(u)}\big|.
\tag{5.1.14}
\]

Therefore, when $\Sigma_u$ is a general unknown covariance matrix, the
m.l.e. of $\theta^{(u)}$ can be obtained by minimizing

\[
\phi_4(\theta^{(u)}) = \big|R(\theta^{(u)})\big|.
\tag{5.1.15}
\]
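The four estimation criteria of this section, (5.1.5), (5.1.6), (5.1.11) and (5.1.15), are all functions of the moment matrix of residuals. A minimal sketch of how they might be evaluated for a given set of residuals is shown below; the function names are hypothetical, and the residuals are assumed to be supplied as an n x r array (one row per experiment, one column per response).

```python
import numpy as np


def moment_matrix(residuals):
    """R = sum_k e_k e_k' for an n x r array of residuals, as in (5.1.2)."""
    e = np.asarray(residuals, dtype=float)
    return e.T @ e


def phi1(R, sigma):
    """Known covariance matrix: tr(Sigma^{-1} R), as in (5.1.5)."""
    return float(np.trace(np.linalg.solve(np.asarray(sigma, dtype=float), R)))


def phi2(R):
    """Independent responses with equal precision: tr(R), as in (5.1.6)."""
    return float(np.trace(R))


def phi3(R):
    """Unknown diagonal covariance matrix: sum_i log R_ii, as in (5.1.11)."""
    return float(np.sum(np.log(np.diag(R))))


def phi4(R):
    """General unknown covariance matrix: |R|, as in (5.1.15)."""
    return float(np.linalg.det(R))
```

In use, the residuals would be recomputed for each trial value of the parameters and the chosen criterion passed to a minimization routine; which criterion applies depends only on what is assumed about the error covariance structure.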

5.2 ESTIMATION OF VARIANCE(S)/COVARIANCES OF ERROR(S)

So far as the design of experiments for model discrimination
is concerned, it may be noted that the estimation of certain
variances or covariance matrices is equally important. We
first obtain an estimate of the covariance matrix, $\Sigma_u$. In
fact, one type of estimator of $\Sigma_u$ is already known to us from
Section 5.1, namely the solution to equation (5.1.12). So
once the m.l.e. of $\theta^{(u)}$ has been determined, the m.l.e. of
$\Sigma_u$ assumes the form

\[
\tilde\Sigma_u = \frac{1}{n}\sum_{k=1}^{n} \hat e_k^{(u)} \hat e_k^{(u)\prime},
\tag{5.2.1}
\]

where

\[
\hat e_k^{(u)} = y_k - \eta^{(u)}(\xi_k, \hat\theta^{(u)}).
\tag{5.2.2}
\]

We seek an unbiased estimate of $\Sigma_u$. Therefore, in what follows
we examine $\tilde\Sigma_u$ for unbiasedness and remove the bias, if any.

From (5.2.1), we have

\[
E^{(u)}(\tilde\Sigma_u) = \frac{1}{n}\sum_{k=1}^{n} E^{(u)}\big(\hat e_k^{(u)} \hat e_k^{(u)\prime}\big).
\tag{5.2.3}
\]


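Now, assuming that $\hat\theta^{(u)}$ does not differ much from the true
value $\theta^{(u)}$, we can write, by Taylor's expansion of the response
function $\eta^{(u)}(\xi_k, \theta^{(u)})$ around $\hat\theta^{(u)}$,

\[
y_k = \eta^{(u)}(\xi_k, \hat\theta^{(u)}) + X_k^{(u)}\big(\theta^{(u)} - \hat\theta^{(u)}\big) + e_k^{(u)},
\]

so that

\[
\hat e_k^{(u)} = X_k^{(u)}\big(\theta^{(u)} - \hat\theta^{(u)}\big) + e_k^{(u)},
\tag{5.2.4}
\]

where $X_k^{(u)}$ is the matrix of first derivatives of the response function with respect to the parameters, given in (5.2.10).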


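Besides, it has already been seen in Section 5.1 that when $\Sigma_u$
is known, the m.l.e. $\hat\theta^{(u)}$ of $\theta^{(u)}$ is supposed to minimize an
objective function of the form

\[
\phi(\theta^{(u)}) = \psi\big(R^{(u)}\big).
\]

This implies that at this minimum the first derivative of $\phi$
must vanish. That is to say,

\[
\sum_{i=1}^{r}\sum_{j=1}^{r} \frac{\partial \psi(R^{(u)})}{\partial R^{(u)}_{ij}}\, \frac{\partial R^{(u)}_{ij}}{\partial \theta^{(u)}_s} = 0, \qquad s = 1, 2, \ldots, p_u.
\tag{5.2.5}
\]

We now recall that

\[
R^{(u)}_{ij} = \sum_{k=1}^{n} e_{ik} e_{jk}, \qquad \text{where } e_{ik} = y_{ik} - \eta_i^{(u)}(\xi_k, \theta^{(u)}).
\tag{5.2.6}
\]

On differentiating (5.2.6) with respect to $\theta^{(u)}_s$, we get

\[
\frac{\partial R^{(u)}_{ij}}{\partial \theta^{(u)}_s} = -\sum_{k=1}^{n} \big(e_{ik}\, x_{jks} + e_{jk}\, x_{iks}\big),
\tag{5.2.7}
\]

where

\[
x_{iks} = \frac{\partial \eta_i^{(u)}(\xi_k, \theta^{(u)})}{\partial \theta^{(u)}_s}.
\tag{5.2.8}
\]

Using (5.2.8) and the symmetry of $R^{(u)}$, equation (5.2.5) results
in the relation

\[
\sum_{k=1}^{n} X_k^{(u)\prime} Z^{(u)} \hat e_k^{(u)} = 0,
\tag{5.2.9}
\]

where $X_k^{(u)}$ and $Z^{(u)}$ are given by

\[
X_k^{(u)} = \left[ \frac{\partial \eta_i^{(u)}(\xi_k, \theta^{(u)})}{\partial \theta^{(u)}_s} \right]_{\theta^{(u)} = \hat\theta^{(u)}},
\tag{5.2.10}
\]

\[
Z^{(u)} = \left[ \frac{\partial \psi(R^{(u)})}{\partial R^{(u)}_{ij}} \right]_{\theta^{(u)} = \hat\theta^{(u)}}.
\tag{5.2.11}
\]

It may be noted from (5.1.5) that when $\Sigma_u$ is known,

\[
\psi\big(R^{(u)}\big) = \mathrm{tr}\big(\Sigma_u^{-1} R^{(u)}\big),
\]

so that

\[
Z^{(u)} = \Sigma_u^{-1},
\]

while in the case of $\Sigma_u$ as an unknown general covariance matrix, we
have from (5.1.14)

\[
\psi\big(R^{(u)}\big) = \log_e \big|R^{(u)}\big|,
\]

and $Z^{(u)}$ in this case will assume the form

\[
Z^{(u)} = \big(R^{(u)}\big)^{-1}.
\tag{5.2.15}
\]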


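Substituting for $\hat e_k^{(u)}$ from (5.2.4) in (5.2.9) we get

\[
\sum_{k=1}^{n} X_k^{(u)\prime} Z^{(u)} \big[e_k^{(u)} - X_k^{(u)}\big(\hat\theta^{(u)} - \theta^{(u)}\big)\big] = 0,
\]

which in turn gives

\[
\hat\theta^{(u)} - \theta^{(u)} = H^{-1} \sum_{k=1}^{n} X_k^{(u)\prime} Z^{(u)} e_k^{(u)},
\tag{5.2.16}
\]

where

\[
H = \sum_{k=1}^{n} X_k^{(u)\prime} Z^{(u)} X_k^{(u)}.
\tag{5.2.17}
\]

Going back to (5.2.4) and using the resulting expression for
$(\hat\theta^{(u)} - \theta^{(u)})$ from (5.2.16) we get

\[
\hat e_k^{(u)} = e_k^{(u)} - X_k^{(u)} H^{-1} \sum_{l=1}^{n} X_l^{(u)\prime} Z^{(u)} e_l^{(u)}.
\tag{5.2.18}
\]

This expression for $\hat e_k^{(u)}$ is now utilized for the evaluation of
$E^{(u)}(\hat e_k^{(u)} \hat e_k^{(u)\prime})$. In fact, since the errors in different experiments are independent,

\[
E^{(u)}\big(\hat e_k^{(u)} \hat e_k^{(u)\prime}\big) = \Sigma_u - X_k^{(u)} H^{-1} X_k^{(u)\prime} Z^{(u)} \Sigma_u - \Sigma_u Z^{(u)} X_k^{(u)} H^{-1} X_k^{(u)\prime}
+ X_k^{(u)} H^{-1} \Big( \sum_{l=1}^{n} X_l^{(u)\prime} Z^{(u)} \Sigma_u Z^{(u)} X_l^{(u)} \Big) H^{-1} X_k^{(u)\prime}.
\tag{5.2.19}
\]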


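Consequently, from (5.2.3) we have

\[
E^{(u)}(\tilde\Sigma_u) = \Sigma_u - \frac{1}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} X_k^{(u)\prime} Z^{(u)} \Sigma_u - \frac{1}{n}\sum_{k=1}^{n} \Sigma_u Z^{(u)} X_k^{(u)} H^{-1} X_k^{(u)\prime}
+ \frac{1}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} \Big( \sum_{l=1}^{n} X_l^{(u)\prime} Z^{(u)} \Sigma_u Z^{(u)} X_l^{(u)} \Big) H^{-1} X_k^{(u)\prime}.
\tag{5.2.20}
\]

It may be noted at this stage that the relation (5.2.20) between
$E^{(u)}(\tilde\Sigma_u)$ and $\Sigma_u$ remains unchanged if $Z^{(u)}$ is multiplied by a constant,
since H would be multiplied by the same constant. Let us
suppose for the time being that $\Sigma_u$ is known. Using the fact
stated above, if we substitute $Z^{(u)} = \Sigma_u^{-1}$ in (5.2.20), we obtain

\[
E^{(u)}(\tilde\Sigma_u) = \Sigma_u - \frac{2}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} X_k^{(u)\prime} + \frac{1}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} X_k^{(u)\prime}
= \Sigma_u - \frac{1}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} X_k^{(u)\prime}.
\tag{5.2.21}
\]

This shows that $E^{(u)}(\tilde\Sigma_u)$ is smaller than $\Sigma_u$ and hence $\tilde\Sigma_u$ is a
biased estimator.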


We now proceed to remove the bias in $\tilde\Sigma_u$, in an r-response
case. In fact, we seek an estimate of $\Sigma_u$ which is some multiple
of $\tilde\Sigma_u$, so that

\[
E^{(u)}(\tilde\Sigma_u) = v_u \Sigma_u.
\tag{5.2.22}
\]

To that purpose we utilize equation (5.2.21). This equation,
no doubt, is based on the assumption that $\Sigma_u$ is known, but its
use in the case of unknown $\Sigma_u$ can be justified on the ground that
in the latter case $Z^{(u)}$ is proportional to $(R^{(u)})^{-1}$ and the
m.l.e. of $\Sigma_u$ is proportional to $R^{(u)}$, so that $Z^{(u)}$ can be taken
proportional to $\tilde\Sigma_u^{-1}$. Besides, the multiplication of $Z^{(u)}$ by a
constant does not make any difference so far as (5.2.21) is
concerned, since H would be multiplied by the same constant.
Equation (5.2.21), therefore, remains valid even in the
case when $\Sigma_u$ is not known. Substituting for $E^{(u)}(\tilde\Sigma_u)$ from
(5.2.21) in (5.2.22) we obtain

\[
v_u \Sigma_u = \Sigma_u - \frac{1}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} X_k^{(u)\prime}.
\]

Postmultiplying both sides by $\Sigma_u^{-1}$ we get

\[
v_u I_r = I_r - \frac{1}{n}\sum_{k=1}^{n} X_k^{(u)} H^{-1} X_k^{(u)\prime} \Sigma_u^{-1}.
\]

Taking the trace on both sides and using the fact that
$(X_k^{(u)} H^{-1} X_k^{(u)\prime} \Sigma_u^{-1})$ is a square matrix, we finally have

\[
v_u r = r - \frac{1}{n}\, \mathrm{tr}\Big[ H^{-1} \Big( \sum_{k=1}^{n} X_k^{(u)\prime} \Sigma_u^{-1} X_k^{(u)} \Big) \Big]
= r - \frac{1}{n}\, \mathrm{tr}(H^{-1} H) = r - \frac{p_u}{n},
\]

so that

\[
v_u = 1 - \frac{p_u}{nr}.
\tag{5.2.23}
\]

Consequently, an unbiased estimate of $\Sigma_u$ is given by

\[
\hat\Sigma_u = \Big(n - \frac{p_u}{r}\Big)^{-1} \sum_{k=1}^{n} \hat e_k^{(u)} \hat e_k^{(u)\prime}.
\tag{5.2.24}
\]

We have thus seen that the m.l.e. of $\Sigma_u$ is biased by the
factor $(1 - p_u/nr)$, that an unbiased estimator of the covariance
matrix, $\Sigma_u$, can be obtained through the moment matrix of
residuals, and that the number of degrees of freedom associated
with this estimation is the number of observations per response
less the average number of parameters per equation.
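The difference between the m.l.e. of (5.2.1) and the bias-corrected estimate of (5.2.24) is only in the divisor of the moment matrix of residuals: n in the first case, and n less the average number of parameters per response in the second. The following minimal sketch makes this concrete; the function name is hypothetical, and the residuals are assumed to be those computed at the m.l.e. of the parameters, arranged as an n x r array.

```python
import numpy as np


def covariance_estimates(residuals, n_params):
    """Return (m.l.e., bias-corrected estimate) of the error covariance matrix.

    residuals : n x r array of residuals evaluated at the fitted parameters
    n_params  : number of parameters p_u of the model
    """
    e = np.asarray(residuals, dtype=float)
    n, r = e.shape
    R = e.T @ e                      # moment matrix of residuals
    mle = R / n                      # divisor n, as in (5.2.1)
    corrected = R / (n - n_params / r)   # divisor n - p_u / r, as in (5.2.24)
    return mle, corrected
```

For a single response (r = 1) the corrected divisor reduces to n - p_u, the familiar residual degrees of freedom.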


Corollary

In the univariate case, we know from (5.1.13) that the
m.l.e. of $\sigma_u^2$ is given by

\[
\tilde\sigma_u^2 = \frac{1}{n}\sum_{k=1}^{n} \big(e_k^{(u)}\big)^2.
\]

But once the m.l.e.'s of $\theta^{(u)}$ have been secured, this estimate
assumes the form

\[
\tilde\sigma_u^2 = \frac{1}{n}\sum_{k=1}^{n} \big(\hat e_k^{(u)}\big)^2,
\tag{5.2.25}
\]

where

\[
\hat e_k^{(u)} = \big[y_k - \eta^{(u)}(\xi_k, \hat\theta^{(u)})\big].
\]

Further, using algebra similar to that of the multivariate
case, it can be shown that this estimate is biased and that the
bias can be removed by multiplying $\tilde\sigma_u^2$ of (5.2.25) by $n/(n - p_u)$.
Thus an unbiased estimate of $\sigma_u^2$ is given by

\[
s_u^2 = \frac{\sum_{k=1}^{n} \big(\hat e_k^{(u)}\big)^2}{n - p_u},
\tag{5.2.26}
\]

with $\nu_u$ [$= (n - p_u)$] degrees of freedom.


5.3 ESTIMATION OF $\Sigma_u$ AND $\sigma_u^2$ UNDER
DIFFERENT TYPES OF ASSUMPTIONS

Multivariate Models: From the discussions in Chapter 4 we
find that so far as the covariance matrix of errors, $\Sigma_u$, under
model u is concerned, it is in two cases that we need its
estimate.

Case (i): When $\Sigma_u = \Sigma$, $u = 1, 2, \ldots, m$, but $\Sigma$ is unknown. In
this case the information about the equality of covariance
matrices must be incorporated. To that purpose we pool the
estimates $\hat\Sigma_u$'s, given in (5.2.24), and obtain an estimate of
the common covariance matrix $\Sigma$ as

\[
\hat\Sigma = \frac{\sum_{u=1}^{m}\sum_{k=1}^{n} \hat e_k^{(u)} \hat e_k^{(u)\prime}}{\sum_{u=1}^{m} \big(n - \frac{p_u}{r}\big)},
\tag{5.3.1}
\]

with $\nu$ [$= \sum_{u=1}^{m} (n - \frac{p_u}{r})$] degrees of freedom. Since in this
case the $\Sigma_u$'s are assumed to be equal, the estimate given in
(5.3.1) provides us with an estimate for each $\Sigma_u$.

Case (ii): In another case, we assumed in Chapter 4 that
$\Sigma_u \ne \Sigma_v$, $u, v = 1, 2, \ldots, m$ ($u \ne v$), and that none of these
covariance matrices is known. The estimate of $\Sigma_u$ will then be
simply given by

\[
\hat\Sigma_u = \frac{\sum_{k=1}^{n} \hat e_k^{(u)} \hat e_k^{(u)\prime}}{n - \frac{p_u}{r}},
\tag{5.3.2}
\]

and has $\nu_u$ [$= (n - \frac{p_u}{r})$] degrees of freedom.

Univariate Models: While designing an experiment for a single
response system we have seen in Chapter 3 that the estimate of
the variance of errors under model u is needed in two cases.

Case (i): When $\sigma_u^2 = \sigma^2$, $u = 1, 2, \ldots, m$, but $\sigma^2$ is unknown.
Since the $\sigma_u^2$'s are equal, an estimate of the
common variance, $\sigma^2$, can be obtained by pooling the estimates
given in (5.2.26). In fact such an estimate can be specified
as

\[
s^2 = \frac{\sum_{u=1}^{m}\sum_{k=1}^{n} \big(\hat e_k^{(u)}\big)^2}{nm - \sum_{u=1}^{m} p_u},
\tag{5.3.4}
\]

with $\nu$ [$= (nm - \sum_{u=1}^{m} p_u)$] degrees of freedom. In the present
case it is this estimate which is to be used as an estimate for
each $\sigma_u^2$.

Case (ii): $\sigma_u^2 \ne \sigma_v^2$, $u, v = 1, 2, \ldots, m$ ($u \ne v$), with all $\sigma_u^2$'s
as unknown quantities. The heterogeneity of variances in this
case does not allow the pooling of the estimates $s_u^2$'s.
Therefore, no further modification is required and the estimate
given in (5.2.26) is used in the design criterion pertinent to
this case.
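The pooling described under Case (i) for univariate models can be sketched as follows. This is a minimal illustration under the assumption that every rival model has been fitted to the same n observations; the function name and the input layout (one list of residuals per model) are hypothetical.

```python
def pooled_variance(residuals_by_model, n_params_by_model):
    """Pooled estimate of a common error variance and its degrees of freedom.

    residuals_by_model : list of m residual sequences, each of length n
    n_params_by_model  : list of the m parameter counts p_u
    """
    m = len(residuals_by_model)
    n = len(residuals_by_model[0])
    if any(len(res) != n for res in residuals_by_model):
        raise ValueError("all models must be fitted to the same n observations")
    # Numerator: residual sums of squares added over all rival models.
    total_ss = sum(sum(e * e for e in res) for res in residuals_by_model)
    # Denominator: nm minus the total number of fitted parameters.
    dof = n * m - sum(n_params_by_model)
    return total_ss / dof, dof
```

When the variances are not assumed equal (Case (ii)), no pooling is done and each model keeps its own residual mean square with n - p_u degrees of freedom.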



CHAPTER 6 


APPLICATION EXAMPLES 


6.1 THE SCHEME FOR IMPLEMENTATION OF THE 
DISCRIMINATION PROCEDURE 


The algorithms developed in the preceding sections for
discriminating among rival models are now implemented in
different types of situations. The technique proposed here
is not only demonstrated through its application to linear and
nonlinear models in r-response ($r \ge 1$) situations, but is also
compared with some of the procedures proposed elsewhere. As
far as the data are concerned, we resort to the Monte Carlo method.
The model that we use for simulating observations would mostly
be one of the rivals. Once the data have been obtained, this
role of the model, assumed to be true, is suspended for the
time being. The said model would then be considered as one of
the competing models and would, therefore, be treated on the
same footing as the other rival models. The next step in the
proposed procedure would be the estimation of the parameters
of all the models under consideration. This would enable us
to use the discrimination criterion so as to assess the current
position. Here it would be natural to expect the model used
for simulating the data to show up as the best model.
The failure to discriminate at this stage would mean the necessity
of designing an additional point, with more discriminatory
power, through an appropriate design criterion. Another value
of the response can now be generated and used in the assessment
of discrimination at this stage. This would be continued till
the given procedure allows us to stop by declaring a model from
amongst the rivals as the most adequate. The entire scheme for
implementation of the proposed discrimination procedure is shown
in the flow diagram of Figure 6.1.1. All computations have been
done on the computer system DEC 1090, using the programming
language FORTRAN.


6.1.1 Simulation of Data 

As a first step we seek the data which can be used
initially for discriminating among the given set of competing
models. While dealing with a real life system the data can
be obtained by carrying out a series of experiments. But, in
the absence of such a system, the experiments can be simulated
on a computer. While simulation of experiments is
indispensable in such a situation, it may be advisable to
employ it in others because of the advantages associated with
its use. For example, even if it is possible to actually
conduct the experiment, it may be wise to first test a method
on computer simulated experiments and then apply it to the
real life problem. This way we can determine economically
whether the method would be successful in actual practice.
Further, to test a method for immunity to sampling
fluctuations it is essential to repeat the whole series of
experiments many times, which may be an expensive job if one
plans to work on a real life system. This object, too, can
be achieved through simulation on a computer at a far lower
cost. With all these considerations we adopt simulation rather
than experimentation for acquiring data, initially as well as
at later stages if a sequential scheme of discrimination is
followed.

FIGURE 6.1.1 Scheme for illustration of sequential discrimination


Now, an experiment, from our point of view, is merely a
device which yields value(s) of the response(s) when an input
of the controllable variable(s) is fed into it. We can,
therefore, simulate it through an algorithm which computes the
value(s) of the r response(s) Y for the given sets of values of $\xi$,
using the formula

\[
y = \eta^{(0)}(\xi, \theta^{(0)}) + e,
\tag{6.1.1}
\]

where $\eta^{(0)}$ denotes the functional form of the model, $M^{(0)}$ (say),
assumed to be the true model for the hypothetical system, $\theta^{(0)}$
represents the set of values assigned to the parameters of $M^{(0)}$,
and the error term, e, consists of pseudorandom number(s)
drawn from $N(0, \sigma^2)$ if the simulation corresponds to a single
response system, and from $N_r(0, \Sigma)$ if the system under
consideration is multiresponse.
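A minimal sketch of such a simulated experiment, following (6.1.1), is given below. It is written in Python with NumPy rather than in the FORTRAN used for the original computations, and the function names, the seed and the settings of $\xi$ in the usage note are illustrative assumptions only.

```python
import numpy as np


def simulate_responses(eta, xi_values, theta, cov, rng):
    """Simulate y = eta(xi, theta) + e at each setting in xi_values.

    eta       : callable returning the r responses (scalar or length-r array)
    xi_values : sequence of n settings of the controllable variable(s)
    theta     : parameter values assigned to the assumed true model
    cov       : error variance (scalar) or r x r error covariance matrix
    rng       : a numpy.random.Generator supplying the pseudorandom numbers
    """
    means = np.array([np.atleast_1d(eta(xi, theta)) for xi in xi_values],
                     dtype=float)                     # n x r expected responses
    cov = np.atleast_2d(np.asarray(cov, dtype=float))  # r x r covariance
    n, r = means.shape
    errors = rng.multivariate_normal(np.zeros(r), cov, size=n)
    return means + errors


def quadratic(xi, theta):
    """Single-response quadratic, used here as an example of a true model."""
    return theta[0] + theta[1] * xi + theta[2] * xi ** 2
```

For instance, `simulate_responses(quadratic, [0.0, 1.0, 2.0, 3.0, 4.0], (1.0, 1.0, 1.0), 1.0, np.random.default_rng(1986))` returns a 5 x 1 array of simulated responses; fixing the seed makes a whole series of simulated experiments repeatable.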




Whether the data are obtained from an experiment or through
simulation, the parameters of all the competing models must be
estimated before we proceed to discriminate. If the discrimination
is to be carried out sequentially, this task is required to be
performed repeatedly in design and analysis. The estimates of the
physical parameters as well as those of the other quantities
obtained at a previous stage must be updated after each
additional set of value(s) of the response(s) is appended to
the data at hand. This makes estimation one of the
important steps in the process of model discrimination. If
the aim is to estimate the model parameters, then an
estimation criterion from amongst the proposed ones, namely
$\phi_1$, $\phi_2$, $\phi_3$, and $\phi_4$, given by (5.1.5), (5.1.6), (5.1.11), and
(5.1.15), respectively, may be exploited, the choice depending
on the assumption about the error covariance structure being
met in a given situation. When the estimates of other
quantities such as $\Sigma_u$, $\Sigma$, $\sigma_u^2$, and $\sigma^2$ are required, the
formulae in (5.3.2), (5.3.1), (5.2.26), and (5.3.4),
respectively, provide us with efficient estimators.

6.1.3 Discrimination among Models and Design of Additional
Experiments

Having obtained the required estimates, the selection
of the best model may be attempted. This can be done by means
of the discrimination index (2.4.1), which utilizes the distances,
$K^{(u)}$, $u = 1, 2, \ldots, m$, from equation (2.5.15) or (2.5.17),
according as the system is uniresponse or multiresponse. If
the discrimination does not seem to have been achieved at a
certain stage, more experiments may be designed by maximizing
the criterion function $\phi$ of (2.4.5), formed through a
suitable function from amongst $h_1$, $h_2$, and $h_3$ given in (3.2.22),
(3.3.32), and (3.4.20), respectively, if the system is
uniresponse, and from amongst those given in (4.2.30), (4.3.28),
and (4.4.16) if the system is multiresponse.


6.1.4 Optimization of the Criterion Functions 


While implementing the discrimination procedure laid down
in this study, the investigator will encounter the
problem of optimizing a function at two stages: firstly,
when the estimates of the model parameters are required to be
obtained through an estimation criterion function, and secondly,
when a new experiment is to be set up through a design
criterion. Posed in the form of $\phi_1$, $\phi_2$, $\phi_3$, and $\phi_4$, the
parameter estimation problem is simply a minimization
problem in the parameter space, $R_\theta$, in which the y's and $\xi$'s
are given numbers and the $\theta$'s are the variables. On the other
hand, the design problem, proposed to be solved through some
appropriate criterion function, is a maximization problem in
the space, $R_\xi$, i.e., the operability range(s) of the input
variable(s), where the y's and $\theta$'s are treated as fixed numbers
and the $\xi$'s as the variables. Many of the tools of deterministic
optimization can be successfully brought to bear on the problem
of optimization. Because of the difficulty involved in
obtaining analytical derivatives, or the time required for
computing numerical derivatives of the objective function,
especially when the models are nonlinear and involve
more than one function, we prefer using a derivative-free
method.

Now, while estimating the parameters in a mechanistic
model, the permissible space of the physical parameters may be
known. This calls for constrained minimization of the function
involved, with the $\theta$'s confined to the parameter space. To that
purpose, in this work we employ a multivariable constrained
optimization routine, 'Golden Complex Search' [Source: R.R.
Hughes, Univ. of Wisconsin, Madison; Language: FORTRAN], based
on the 'complex' method of Box (1965). In addition to the
basic algorithm of Box, this routine incorporates the following
provisions: (i) golden search between the reflected point and
the discarded point, (ii) random generation of new points, if
the collapse of the complex threatens, (iii) random restart(s)
with a smaller complex centred on the trial optimum (optima), and
(iv) the weighting of centroid calculations by factors proportional
to some power of the differences between the objective function
values for the points and that for the worst point in the simplex.

As regards the selection of optimal experimental conditions,
the choice is generally not unrestricted. Therefore, searching
for the maximum of $\phi$ involves constrained optimization, with
the input variables confined to a bounded feasible region. In
such a case, the grid search technique has been used, as the
number of variables in the applications discussed here is small,
being 1 or 2. It may, however, be pointed out that when there
are 3 or more variables, the number of points on the grid may
become too large. In such cases the maximum may be realized
by using the above mentioned constrained optimization routine.
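The grid search mentioned above amounts to evaluating the design criterion at every point of a rectangular grid over the feasible region and keeping the best point. A minimal sketch is given below; the function name is hypothetical and `criterion` stands for whichever design criterion function is being maximized.

```python
import itertools


def grid_search_max(criterion, axes):
    """Maximize criterion(point) over the Cartesian product of the grid axes.

    criterion : callable taking a tuple of input-variable settings
    axes      : one sequence of candidate settings per input variable
    """
    best_point, best_value = None, float("-inf")
    for point in itertools.product(*axes):
        value = criterion(point)
        if value > best_value:       # ties keep the first point encountered
            best_point, best_value = point, value
    return best_point, best_value
```

With one input variable on the grid 0.0 (0.2) 4.0 this involves only 21 evaluations, while with d variables and g grid values each the count grows as g to the power d, which is why the search becomes impractical for 3 or more variables.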

6.2 DISCRIMINATION AMONG UNIVARIATE MODELS
6.2.1 Example: Linear Models; Known, Equal Error Variances

We start with a simple example of discriminating among
univariate linear models. Consider a situation in which the
data are simulated through the polynomial, $(1 + \xi + \xi^2)$, and the
most adequate model is to be chosen from amongst the four models,
namely,

$M^{(1)}$: $\eta^{(1)} = \theta_1^{(1)}\xi$,

$M^{(2)}$: $\eta^{(2)} = \theta_1^{(2)} + \theta_2^{(2)}\xi$,

$M^{(3)}$: $\eta^{(3)} = \theta_1^{(3)} + \theta_2^{(3)}\xi + \theta_3^{(3)}\xi^2$,

$M^{(4)}$: $\eta^{(4)} = \theta_1^{(4)}\xi + \theta_2^{(4)}\xi^2$.

Initially, five settings of the independent variable, $\xi$,
are chosen and a sample of five values of the response, Y, is
formed through the relation

\[
y = 1 + \xi + \xi^2 + e,
\tag{6.2.1}
\]

where e has been assumed to be distributed as $N(0, 1)$. The
results obtained are presented through runs 1 to 5 in Table
6.2.2. Based on these data, the estimates of the parameters
$\theta^{(1)}$, $\theta^{(2)}$, $\theta^{(3)}$, $\theta^{(4)}$ are obtained through the criterion
function, $\phi_1(\theta^{(u)})$, and are presented in Table 6.2.1. This table also
shows the estimates of other quantities involved in the
distance given by (2.5.25). These estimates are
calculated from the samples, supposed to have been drawn from the
populations corresponding to the rival models. Having assumed
the initial value 0.25 for each model, u = 1, 2, 3, 4, the use of
these estimates further enables us to determine the values of
$D^{(u)}$, u = 1, 2, 3, 4, as shown in Table 6.2.2. A comparison of the
values, 0.6981, 0.1687, 0.0197, 0.1135, of the discrimination
index, based on the preliminary data of 5 points, not only brings
to light the fact that $M^{(3)}$ is the best model (with the least
value, 0.0204, of the discrimination index) but also indicates
that $M^{(1)}$, with $D^{(1)}$ = 0.6981, is the worst model for the data
simulated through a general quadratic. Thus the proposed
discrimination criterion picks up the correct model, i.e., the
one which generated the data.
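With a known, common error variance the criterion $\phi_1$ reduces to the residual sum of squares, so each rival can be fitted by ordinary least squares. The sketch below fits four polynomial rivals in this way. The functional forms coded for models 1, 2 and 4 are an assumption read off the number of parameters reported for each model, only the general quadratic (model 3) is confirmed by the fitted equation quoted later in the text, and the function names are hypothetical.

```python
import numpy as np

# Design-matrix builders for the four rival models (forms of 1, 2, 4 assumed).
MODELS = {
    1: lambda xi: np.column_stack([xi]),                           # theta1*xi
    2: lambda xi: np.column_stack([np.ones_like(xi), xi]),         # line
    3: lambda xi: np.column_stack([np.ones_like(xi), xi, xi**2]),  # quadratic
    4: lambda xi: np.column_stack([xi, xi**2]),                    # no intercept
}


def fit_linear_model(design, y):
    """Least-squares estimates and residual sum of squares for one model."""
    y = np.asarray(y, dtype=float)
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ theta
    return theta, float(residuals @ residuals)


def fit_all(xi, y):
    """Fit every rival model to the same data; returns {model: (theta, rss)}."""
    xi = np.asarray(xi, dtype=float)
    return {u: fit_linear_model(build(xi), y) for u, build in MODELS.items()}
```

The residuals from these fits are what the variance estimates and the discrimination index are then built upon; the residual sum of squares alone is not the discrimination index of the text.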


In order to see the trend of the discrimination index and the
efficiency of the discriminatory design criterion, max $\phi$,
in sequential discrimination, we design some additional points
using (3.2.23). To that purpose $\xi$ is constrained to lie in the
interval $\{0 \le \xi \le 4\}$, and a search is always made over the grid,
{0.0 (0.2) 4.0}, of 21 points. The results are presented in
Table 6.2.2. It can be seen through this table that just one more
point, $\xi$ = 4.0, added to the initial data pulls the value of
$D^{(3)}$ down to 0.0005. On the other hand, the value of ... rises ...

Table 6.2.1

Estimation of model parameters at different stages

Stage (in order listed)   Model   Estimates of parameters
                                  first      second     third
1                         1        4.5056
                          2        1.4005    4.9717
                          3        0.6596    0.8492    1.0310
                          4        1.4254    0.9240
2                         1        4.7374
                          2       -1.6768    5.2481
                          3        0.6672    0.8222    1.0413
                          4        1.3927    0.9382
3                         1        4.6652
                          2       -2.0063    5.3097
                          3        0.6748    1.0544    0.9805
                          4        1.6386    0.8741
4                         1        4.5604
                          2       -2.5081    5.4031
                          3        0.6701    0.8691    1.0288
                          4        1.4530    0.9225
5                         1        4.7620
                          2       -2.7862    5.6423
                          3        0.7027    0.7601    1.0740
                          4        1.3660    0.9651




Table 6.2.2 Sequential runs, settings of $\xi$, responses, and values of the discrimination index for the four models (bracketed values: posterior probabilities of Box and Hill (1967))


At this point there appears to be no necessity of designing more
points. However, 4 additional values of $\xi$ are acquired through the
design-criterion function (3.2.23) so as to check the validity of
the claim made at the previous (sixth) run and to see the trend of
the criterion. This is depicted in Figure 6.2.1. The graphs in this
figure show that the value of $D^{(3)}$ keeps dropping fast as more and
more discriminatory points are added to the data. It also
points out that $M^{(4)}$ is the closest rival of $M^{(3)}$. Besides,
through a close look at the values of the discrimination index in
Table 6.2.2, one may notice that the models have been ordered
according to their appropriateness. Such information may
prove to be useful when the cost of using a model is a consideration
and it may be cheaper to use a model other than the one selected
through the criterion. In this case one would naturally look
for the second best.


Box and Hill (1967) have considered the same problem of
discriminating among four polynomial models. The results
obtained by them are presented through the bracketed values in
Table 6.2.2. Looking at these values one notices that after 5
preliminary data points the posterior probability no doubt indicates
that $M^{(3)}$, with 0.66 as its probability, is the best model, but
not as clearly as the discrimination index does. As one moves
further one notices that the altitude which $P^{(3)}$ attains at
the sixth run, being 0.88, suddenly drops to 0.75. This
leaves one in two minds as to whether to declare this model
the best or to run further at the next run.
Fortunately, in this problem it rose to 0.90 and still higher
to 0.97, when according to Box and Hill (1967) one could take
a decision that $M^{(3)}$ was the best model. Nevertheless, because
of the fluctuating nature of their discrimination criterion
it was not safe to declare $M^{(3)}$ as the best model just on the
basis of the initial set of observations or even after 3 more
points had been designed. It actually took them 9 runs, in
total, to take the final decision. On the other hand, the
behaviour of the discrimination index being more stable, it is
safe to take the decision even on the basis of 5 preliminary
observations, if need be.

FIGURE 6.2.1 Status of models as determined by discrimination
index; linear univariate models. True model: Model 3.
(Axes: discrimination index against sequential runs.)


Furthermore, it may be noted that the Box-Hill method
puts more stress on bad models. For example, at the first
instance itself the probabilities of models 1 and 2 are
rendered very low, being 0.00 and 0.01, respectively. Even at
later stages the fall in the probabilities of these models can
be seen to be much faster as compared to the rise in the
probabilities of better models. On the other hand, the present
procedure gives more importance to the better models, as can be
observed through the values 0.6981, 0.1687, 0.0197, 0.1135
of the discrimination index to start with and in the later
stages.



Another point which adds to the usefulness of the procedure
proposed here is concerned with the adequacy of the selected
model. At the 4th sequential stage, where we decide to stop, we
find the value of the statistic V to be as low as 0.0002. This
value of V is insignificant as compared to the 5% point, 3.841,
of the Chi-square distribution with 1 degree of freedom. We,
therefore, conclude that model $M^{(3)}$ is not only the best
model, as decided earlier, but also an adequate model for the data
simulated from the normal distribution with expectation given
through the polynomial $1 + \xi + \xi^2$ and variance 1.0. To be
more specific, the model

    $Y = 0.7027 + 0.7601\,\xi + 1.0740\,\xi^2$,

as seen at the termination of our procedure, is a ready-to-use
model in the given set-up.
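
The adequacy decision above amounts to comparing V with the upper 5% point of a Chi-square distribution with 1 degree of freedom. A minimal sketch of that comparison follows; the function names are hypothetical and the computation of V itself is not shown. It uses the fact that a Chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from statistics import NormalDist


def chi2_upper_point_1df(alpha=0.05):
    """Upper alpha point of Chi-square with 1 degree of freedom.

    A Chi-square(1) variable is the square of a standard normal one, so the
    point is the square of the upper alpha/2 point of the standard normal.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


def is_adequate(v, alpha=0.05):
    """Hypothetical helper: the model is not rejected when V is below the point."""
    return v < chi2_upper_point_1df(alpha)


critical = chi2_upper_point_1df()
print(f"5% point = {critical:.3f}")
print("V = 0.0002 adequate:", is_adequate(0.0002))
```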


It has been observed by many investigators that the points
designed for discrimination through some of the procedures are
not good for estimation. Therefore, the parameters of the model
selected through these methods normally require polishing
before one can put it to use. It can actually be seen through
Table 6.2.2 that the Box-Hill procedure in this example always
results in 0.0 as the optimal value of $\xi$. On the other hand,
the present procedure generates points such as 1.6 and 4.0,
when the design space has been considered to be $0.0 \le \xi \le 4.0$.
In fact, while working on real life systems, it is
recommended that occasionally experiments be chosen in the
interior of the region, even when not prescribed by the design
criterion.

Lastly, we discuss the role of weights, $w_{u,v;k}$, in designing
additional experiments for the problem under consideration. It may
be observed through Table 6.2.2 or in Figure 6.2.1 that the pair
$(M^{(2)}, M^{(4)})$, found to be the closest

FIGURE 6.2.2 Change in weights, $w_{u,v;k}$, from first
to second stage; linear univariate models.


Table 6.2.3

Weights used in the criterion function $\phi$ at different
stages; discrimination among polynomial models

Sequential    Model                  Model u
stage k         v          1           2           3
------------------------------------------------------
    1           2      0.172998
                3      0.020281    0.083937
                4      0.116371    0.481631    0.124782

    2           2      0.078665
                3      0.000972    0.020478
                4      0.038918    0.819574    0.041392

    3           2      0.023600
                3      0.000032    0.004063
                4      0.074030    0.951948    0.012954

    4           2      0.007617
                3      0.000001    0.001005
                4      0.001114    0.983388    0.006874

    5           2      0.002173
                3      0.000000    0.000261
                4      0.000150    0.993641    0.003775
------------------------------------------------------


at the end of the fifth run, receives most of the attention,
i.e., 48%, in designing the 6th experiment. Similarly, the
pair $(M^{(1)}, M^{(3)})$, consisting of the farthest models, has been
given the least importance through a weight of 0.020281 only.
The other pairs, too, are given their due through the weights
0.172998 $(M^{(1)}, M^{(2)})$, 0.116371 $(M^{(1)}, M^{(4)})$,
0.083937 $(M^{(2)}, M^{(3)})$, and 0.124782 $(M^{(3)}, M^{(4)})$ in
designing the sixth setting of $\xi$.
The values of weights assigned to different pairs of models
at various sequential stages are shown in Table 6.2.3, where it
can be easily seen that throughout the design process the
pair $(M^{(2)}, M^{(4)})$ kept on receiving the highest importance,
while the pair $(M^{(1)}, M^{(3)})$ has always been given the least
weight as it never required further divergence. This way a look at
the values of the discrimination index at any stage, in Table
6.2.2, clearly indicates that the proposed weights appropriately
decide the role of a pair of models in designing an additional
experiment.
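
As a quick check on the role of the weights, the following sketch looks at the first-stage weights of Table 6.2.3. The assignment of each value to a pair (u, v) follows the table layout as read here and should be treated as an assumption; the variable names are illustrative.

```python
# First-stage weights from Table 6.2.3, keyed by the model pair (u, v).
# Assumption: the pair labels follow the row/column layout of the table.
stage1_weights = {
    (1, 2): 0.172998,
    (1, 3): 0.020281,
    (2, 3): 0.083937,
    (1, 4): 0.116371,
    (2, 4): 0.481631,
    (3, 4): 0.124782,
}

total = sum(stage1_weights.values())
heaviest = max(stage1_weights, key=stage1_weights.get)  # pair given most attention
lightest = min(stage1_weights, key=stage1_weights.get)  # pair given least attention

print(f"sum of weights = {total:.6f}")
print("heaviest pair:", heaviest, " lightest pair:", lightest)
```

The weights of a stage add up to one, so each of them can be read as the share of attention a pair receives in designing the next experiment.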


Finally, in order to examine the proposed discrimination
criterion for its immunity to sampling fluctuations, and to attach
as well a confidence level with our contention, we have
simulated 500 samples of the response using random samples of
the values of $\xi$, uniformly distributed over its operability
region, $\{0.0 \le \xi \le 4.0\}$. We find that 489 times the criterion
picks up the correct model, although sometimes discrimination
based on a set of initial data is not found to be very sharp.
As regards the remaining 11 samples, the distinction among models
is not seen to be clear, although none of these has shown $M^{(3)}$
to be a bad model.
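
The simulation result can be summarised as a proportion with an approximate confidence interval. The sketch below is an illustration only: the Wilson score interval and the 95% level are assumptions made here, not necessarily the way a confidence level is attached in this work.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, trials, level=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / trials
                         + z * z / (4.0 * trials ** 2)) / denom
    return centre - half, centre + half


correct, samples = 489, 500      # counts reported in the text
p_hat = correct / samples        # observed proportion of correct selections
low, high = wilson_interval(correct, samples)
print(f"proportion = {p_hat:.3f}, interval = ({low:.3f}, {high:.3f})")
```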


6.2.2 Example: Nonlinear Models; Unknown,
Homogeneous Error Variances


The following example has been considered by Buzzi
and Forzatti (1983) for demonstrating the implementation of
their discrimination scheme and for its comparison with the
Box-Hill procedure. We include it here for comparison of our
method with that of Box and Hill (1967) and with the one
given by Buzzi and Forzatti (1983). In this example we,
therefore, consider the same set of five kinetic models, namely,

,2 , fc m(1) 

j^(l) j - 


6-^p - (S,/ G i ) 


(2 s i-,%1 - tS,/Si (2) ) 

* * = te!p * 4 z) i7+ ®4 ZJ 6 2 + 


M 


M 


(3) t v (3) 


ie 


13 7 




+ h + § 2 


(4) _(A-) 

M v ' * 


ie 


w 


jiil 


7 4 4) Sl * <®fVs 2 > + e 5 4 S } * 


mx 




M 


(5) , „(5) . 


Sf 7 ^ E 2 *^'¥ 2 + 8 5 'b 


mx 


rmr 


W 


These models have, in fact, been originally proposed for
the process of synthesis of methanol from carbon monoxide and
hydrogen. We, however, like Buzzi and Forzatti, carry
out a Monte Carlo study and simulate experiments through the
model


5 I s 2 


58824.0 £ 


(1704.0 + 4.25 § 2 + 0.241 + 444.6 % 3 ) 


2 + e » 


( 6 . 2 . 2 ) 


with $e$ as a pseudorandom number from $N(0, 4.0 \times 10^{-8})$. While
designing the optimal settings of $\xi_1$, $\xi_2$, and $\xi_3$, we have
likewise restricted these variables within their respective
operability regions, namely, $\{15 \le \xi_1 \le 25\}$,
$\{200 \le \xi_2 \le 250\}$, and $\{5 \le \xi_3 \le 10\}$. The
corresponding observations are simulated through the model (6.2.2).
So far as the initial set of observations is concerned we have used
the data simulated by Buzzi and Forzatti (1983), based on a $2^3$
factorial design of $\xi_1$, $\xi_2$, $\xi_3$. On the basis of these 8
simulated observations the Box-Hill method produced posterior
probabilities of the five models as 0.3000, 0.0007, 0.0400,
0.3290, and 0.3290, which hardly show any distinction among
$M^{(1)}$, $M^{(4)}$, and $M^{(5)}$. Even the estimates $4.83 \times 10^{-6}$,
$16.8 \times 10^{-6}$, $8.78 \times 10^{-6}$, $4.65 \times 10^{-6}$, and
$4.65 \times 10^{-6}$ of the $S_8^{(u)}$'s in the method of Buzzi and
Forzatti, which are claimed to, at least, screen out bad models, do
not seem to do the job.
The same set of data when used in (2.4.1) gives the values of


the discrimination index as 0.0501, 0l1 

0.0005, for the models V™ ,* W . M ’ 

respectively • These 


of M (5) , i.e., the 



y indicate the superiority 
the data and the 


inferiority of $M^{(2)}$. At this stage, both the posterior
probability and the criterion of Buzzi and Forzatti
no doubt show that $M^{(2)}$ is a bad model but are not even
slightly indicative of the reality which we are actually
interested in revealing. Thus, one can see the greater discriminating
power of the discrimination index as compared to the other two
criteria, when only the initial set of data has been used.


The inability of the other two discrimination criteria
in choosing the best model at this stage necessitated the
designing of additional points, as was done through their
respective design criteria. The results obtained are presented
in Table 6.2.4, where the values bracketed as ( ) are obtained
through the Box-Hill procedure and those bracketed as [ ] come
through the discrimination scheme of Buzzi and Forzatti.
It may be noted from this table that the values of $P^{(2)}$ and
$P^{(3)}$ keep decreasing throughout the sequential process and
finally go as low as 0.0, thereby showing that $M^{(2)}$ and $M^{(3)}$
are poor models. On the other hand, $M^{(1)}$ remains the least
favoured model up to the 11th run only, after which it starts
showing up as the best model; the preference shifts from $M^{(4)}$
and $M^{(5)}$ to $M^{(1)}$. In any case, the Box-Hill procedure could not
pick up the true model. Instead it declared $M^{(1)}$ as the most
probable model. Similarly, up to a considerable number of runs
the values of $S_n^{(u)}$, too, do not seem to bring out the
fact that $M^{(5)}$ is the correct model. Even at a stage
as late as the 30th run the values of $S_{30}^{(u)}$ are





Table 6.2.4

Sequential discrimination among nonlinear models: the
present, the Box-Hill, and the Buzzi et al. procedures
(True model: $M^{(5)}$)

[Table entries are not legible in the source.]

(i) Values bracketed as [ ] are to be multiplied by [factor illegible].
(ii) 1.0E-k stands for $10^{-k}$.


$4.16 \times 10^{-6}$, $15.4 \times 10^{-6}$, $4.46 \times 10^{-6}$,
$4.14 \times 10^{-6}$, and $4.21 \times 10^{-6}$,
and the prescribed test on these values does not show any
distinction among the closer models, i.e., $M^{(1)}$, $M^{(3)}$,
$M^{(4)}$, and $M^{(5)}$. By using the procedure of Buzzi and Forzatti
we only end up with the conclusion that the models $M^{(1)}$,
$M^{(3)}$, $M^{(4)}$, and $M^{(5)}$ can all be considered to be
appropriate for the simulated process; according to them these
models are equivalent on statistical grounds. This conclusion, which
has been made possible after 30 experiments, could have been drawn
through the discrimination index at the initial stage itself.
In fact, from the preliminary data alone the index could extract
much information: it not only showed the equivalence of
these models but also established the truth that $M^{(5)}$ is
the model which seems to have generated the data.

It is important to compare the three procedures at the
11th run, for it is at this stage that the present procedure
may be terminated. A close look at the values of the three
discrimination criteria in Table 6.2.4 shows that at this stage,
when the Box-Hill procedure just starts favouring a wrong
model, $M^{(1)}$, and the method of Buzzi and Forzatti has just screened
out one model, being $M^{(2)}$, the discrimination index has already
shown enough evidence in favour of $M^{(5)}$ as well as given a
clear indication as to which of the rival models are close
to $M^{(5)}$. This exhibits the potential power of the procedure
proposed here.



It may further be noted that because of $P_{30}^{(2)}$ being zero
and $S_{30}^{(2)}$ (as $15.4 \times 10^{-6}$) being significant as
compared to the

estimate of the experimental error variance, 0 ^ ( = 11.3 jclO*""^) 

the other two procedures have been quite decisive in
rejecting $M^{(2)}$; both the procedures seem to put more stress
on the identification of the bad model rather than distinguishing
$M^{(5)}$ from $M^{(1)}$, $M^{(3)}$, and $M^{(4)}$. However, the results
obtained through the two methods lead to different conclusions:
whereas one prefers model $M^{(1)}$ over all other models, the other
considers it to be the same as $M^{(4)}$ and $M^{(5)}$. But the fact
that $M^{(5)}$ is the model which has been generating the data ought
not to be ignored.

So far as the performance of the proposed design and
discrimination criteria in identifying the true model is
concerned, 3 more additional experiments have been designed,
although the picture had been made quite clear at the 8th run
itself. The results obtained are shown in Table 6.2.4. It
can be seen from this table that the decision on the best model
remains invariant throughout the sequential process. One can
declare $M^{(5)}$ as the true model at any stage; in fact, more and
more decisively as one moves from one stage to the next.
Furthermore, it is easy to see that barring $M^{(2)}$, the other
models, i.e., $M^{(1)}$, $M^{(3)}$, and $M^{(4)}$, are close to $M^{(5)}$.


Finally, when we stop our procedure at the 11th run
according to the stopping rule and declare $M^{(5)}$ as the


{5) r 

best model with D^q = 0,9x10 D we also establish that this 

model is adequate for the simulated process. In fact, the
value of the statistic V for $M^{(5)}$, being 0.00001, is much
less than the 5% point ($\chi^2_{1,0.05} = 3.841$) of the
Chi-square distribution with one degree of freedom. This
enables us to accept the hypothesis on the adequacy of $M^{(5)}$.
Similarly, when the values 0.0118, 0.0910, 0.9630 of the
statistic V for models $M^{(1)}$, $M^{(3)}$, and $M^{(4)}$, respectively,
are compared with the above tabulated value we conclude that
on statistical grounds these models can also be considered to
be good. The use of the $\chi^2$ approximation in testing the
above hypotheses is justified by the small value of $\omega_2$, being
0.00023. Thus the ready-to-use form of the model may be


specified as 
E(Y) 


{M| -(1^0.0002)} 


nns y — — r? . 

{1704.0015 + 4.25 l 2 + 0-2408 5 lh + 443-9291 g 3’ 


In addition, on the basis of the discrimination index we also 
conclude that is the closest model to M 


Figure 6.2.3 is given here to present a picture of the
progress in discrimination achieved in this problem through
the proposed procedure. It may be observed from this figure
that the graph G5 always remains below the graphs G1,
G3, G4, and G2. This establishes the superiority of
$M^{(5)}$ over the other rival models throughout the sequential
procedure. However, the falling trends of G1, G3, and
G4 indicate that $M^{(1)}$, $M^{(3)}$, and $M^{(4)}$ can also be
considered to be good models. On the other hand, G2 keeps
rising consistently. This confirms that $M^{(2)}$ is a bad model.
These are some of the facts which have already been proved through
the proposed adequacy test.


FIGURE 6.2.3 Progress in discrimination through the
proposed procedure (discrimination index against sequential
runs); nonlinear univariate models. True model: Model 5.


As far as the role of the weights, $w_{u,v;k}$, in designing
new experiments is concerned, it can be seen through Table 6.2.5
that the weights attached to different pairs are determined
according to the position of a particular pair relative to all
other pairs. For example, in Table 6.2.4 we find that the
farthest models are $M^{(2)}$ and $M^{(5)}$, while $M^{(3)}$ and $M^{(4)}$
make the pair of closest models. Accordingly, while designing the
9th experiment the pair $(M^{(2)}, M^{(5)})$ receives the least weight,
i.e., 0.000269, while the pair $(M^{(3)}, M^{(4)})$ is attached
the maximum weight of 0.347799, as can be seen through Table
6.2.5. The next closest pair, being $(M^{(1)}, M^{(4)})$, is given a
weight of 0.255276. Similarly, the other pairs, too, are
attached their due weights. Figure 6.2.4 gives a pictorial
view of the allocation of weights for designing the 9th
experiment. It can be seen through a combined study of
Tables 6.2.4 and 6.2.5 that at every stage heavy weights are
assigned to models which are yet to be diverged.

6.2.3 Example


Consider the problem of discriminating among the following
four models proposed for a chemical reaction:



Table 6.2.5

Weights, $w_{u,v;k}$, for the criterion function $\phi$ at different
stages; discrimination among nonlinear univariate models

Sequential    Model                        Model u
stage k         v          1           2           3           4
------------------------------------------------------------------
    1           2      0.036390
                3      0.174152    0.106522
                4      0.255276    0.072670    0.347799
                5      0.003768    0.000269    0.001287    0.001887

    2           2      0.003732
                3      0.116208    0.030357
                4      0.165820    0.021274    0.662385
                5      0.000171    0.000001    0.000021    0.000030

    3           2      0.000356
                3      0.089144    0.007287
                4      0.250230    0.002569    0.650377
                5      0.000009    0.000000    0.000000    0.000001
------------------------------------------------------------------



FIGURE 6.2.4 Change in weights, $w_{u,v;k}$, from first to second
stage; nonlinear univariate models.




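$M^{(1)}$ : $\eta^{(1)} = \exp\{-\xi_1 \exp(\theta_1^{(1)} - \theta_2^{(1)} \xi_2)\}$ ,

$M^{(2)}$ : $\eta^{(2)} = \{1 + \xi_1 \exp(\theta_1^{(2)} - \theta_2^{(2)} \xi_2)\}^{-1}$ ,

$M^{(3)}$ : $\eta^{(3)} = \{1 + 2\xi_1 \exp(\theta_1^{(3)} - \theta_2^{(3)} \xi_2)\}^{-1/2}$ ,

$M^{(4)}$ : $\eta^{(4)} = \{1 + 3\xi_1 \exp(\theta_1^{(4)} - \theta_2^{(4)} \xi_2)\}^{-1/3}$ .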

It is the amount of reactant which is supposed to be measured
in the reaction under consideration. Besides, $\xi_1$ is the
time, $t$, and $\xi_2$ $[= (1/T - 1/525)]$ is the scaled inverse absolute
temperature, with $T$ representing the temperature. In actual
experimentation the experimenter would probably like to control
the input variables $t$ and $T$ over the grid {t = 25(25)150,
T = 450(25)600}. In the simulation of the problem, as we
carry out here, the same grid is used for the selection of time
and temperature for generating observations. Further, we
assume that $M^{(2)}$ is the true model and accordingly simulate the
amount of reactant, $y$, using the equation

$M^{(0)}$ : $y = [1 + \xi_1 \exp\{-(3.53235 + 5000.0\,\xi_2)\}]^{-1} + e$ ,    (6.2.3)

with $e$ being a pseudorandom number from $N(0, 0.0025)$. This
set-up is the same as used by Hill et al. (1968). To start
with, we use the data simulated by them corresponding to the
settings of $t$ and $T$ according to a $2^2$ factorial design. After
securing estimates of the parameters in all the four competing
models, the values of the discrimination index for the
models are found to be 0.4310, 0.0071, 0.2295, and 0.3324. Thus
the preliminary data show large differences between the value
of $D_4^{(2)}$ and that of $D_4^{(1)}$, $D_4^{(3)}$, and $D_4^{(4)}$. This is
indicative of the superiority of $M^{(2)}$. On the other hand, with the
same preliminary data the posterior probability formula of Box and
Hill gives 0.0060, 0.4335, 0.4087, 0.1518 as the respective
probabilities of the four models.
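
The simulation set-up of equation (6.2.3) can be sketched as follows. This is a minimal illustration under stated assumptions: the second argument of N(0, 0.0025) is taken to be the variance, the temperature is scaled as 1/T - 1/525, and the grid step for t is taken as 25. The function names and the fixed random seed are illustrative and not part of the original study.

```python
import math
import random


def eta_true(t, temp):
    """Expected response under (6.2.3) for time t and absolute temperature temp."""
    xi2 = 1.0 / temp - 1.0 / 525.0   # assumed scaling of inverse temperature
    return 1.0 / (1.0 + t * math.exp(-(3.53235 + 5000.0 * xi2)))


def simulate_response(t, temp, rng, variance=0.0025):
    """One simulated observation: expected response plus a normal error."""
    return eta_true(t, temp) + rng.gauss(0.0, math.sqrt(variance))


# Grid {t = 25(25)150, T = 450(25)600} over which settings are selected.
grid = [(t, temp) for t in range(25, 151, 25) for temp in range(450, 601, 25)]

rng = random.Random(0)   # fixed seed only to make the sketch repeatable
observations = [(t, temp, simulate_response(t, temp, rng)) for t, temp in grid]
print(len(observations), "simulated observations")
```

At T = 525 the scaled variable is zero, so the expected response there depends on the time alone, which makes that temperature a convenient point for checking the function.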


When another experimental setting, namely (150, 525), is
decided through a search of the maximum of $\phi$ over the prescribed
grid, utilizing (3.3.32), the values of the index for
$M^{(1)}$ and $M^{(4)}$ rise to 0.5108 and 0.3397, respectively,
$D^{(2)}$ drops to 0.0003, and $D^{(3)}$ decreases to 0.1493. It clearly
shows that $M^{(2)}$ is the best model for the data simulated through
(6.2.3) and that $M^{(3)}$ is its closest rival. The Box-Hill criterion
suggests a different setting of $t$ and $T$, i.e., (150, 550). The
posterior probabilities which are obtained as a result do not seem
to differentiate reasonably well between $M^{(2)}$ and $M^{(3)}$, $P^{(2)}$
and $P^{(3)}$ being of the orders 0.5580 and 0.3740, respectively;
no doubt $M^{(1)}$ has been clearly shown to be a poor model with
$P^{(1)} = 0.0004$. The progress in discrimination through the two
procedures can be studied through Table 6.2.6, where the
bracketed values come through the Box-Hill procedure used by
Hill et al. (1968). Figures 6.2.5 and 6.2.6 show the trend of
discrimination through the two procedures from one stage to
another. In fact, these figures give a pictorial view of the
changes in the relative status of the competing models as decided
by the two discrimination criteria. The procedure proposed here



Table 6.2.6

Sequential discrimination among nonlinear univariate
models: the present and the Box-Hill procedures
(True model: $M^{(2)}$)

[Table entries are not legible in the source.]


suggests termination at the third sequential stage with the
decision that $M^{(2)}$ is the adequate model for the system simulated
through $M^{(0)}$ of equation (6.2.3). On the other hand, Hill et
al. could stop the Box-Hill procedure after designing 11
additional experiments, when the posterior probability of
$M^{(2)}$ rose to 0.9996. At the same time it is worth noting that
through the Box-Hill procedure one could declare $M^{(1)}$ as the
worst model much earlier, i.e., after one had designed 2 additional
experiments only. An important point brought out by
this observation is that in the Box-Hill method more
emphasis is on bad models rather than on good ones. It can be
seen through Table 6.2.6 that the posterior probability shows
large differences between the best and the worst models, while
discrimination between the close rivals, $M^{(2)}$ and $M^{(3)}$, is much
slower. This is not the case with the present method, which
on the other hand puts more stress on good models and discriminates
sharply between the closest rivals, as can be seen through
Table 6.2.6. This is the effect of the weights being used in
the criterion function $\phi$; the closer models are always given
more weightage in designing experiments than the ones which
are already farther apart at the previous stage. Finally, we notice
in Table 6.2.6 that the points obtained through the two procedures
are different, though both arrive at the same conclusion, namely,
that model $M^{(2)}$ is the correct one.




6.3 DISCRIMINATION AMONG MULTIVARIATE MODELS

6.3.1 Example: Nonlinear Models; Unknown Equal Error
Covariance Matrices

Buzzi et al. (1984) have simulated discrimination among
the following four bivariate chemical kinetic models through
their procedure:

(1) , „(1) _ a (l) t t //, , a (i) t ^ n (i) 


M vx/ i n 


1 " ^ ; y 2 ' /(l +0 3 ^1 +9 4 l 2 ) 

4 l)s i £ 2/a + eG)5 l + e(Dg 2 ) , 


„(2) , 

■n^ . 

« ©2^ 5 l g 2 

/ 

(1 4 

• + 

e (2 h ) 2 

e 4 V 


4 2) - 

' 8 2 2) 5 1 1 2 

/ 

(1 4 


9 


”i 5) ■ 

■ 8 P )£ 1 £ 2 

/ 

(1 4 

■ #h z ) z 




■ e 2 3)| l 5 2 

/ 

(14 

■ V 2 

9 

M (4) ! 

„w 

» C.5, 

1 1 t 

,/ 

(1 

+ 8 3 4) h + 

e 4 4) y 



m 9^ % A , 
Z 1 C 

>/ 

(1 




We take up discrimination among these models in order to
see the potential of our procedure in identifying the correct
model and compare its performance with the method of Buzzi et al.
So far as the data are concerned they have used $M^{(1)}$ as the
true model. In this connection, it may be pointed out that the
values 0.1, 0.01, 0.1, and 0.01 of the parameters of this model,
as reported in their paper, do not seem to be the ones which
have been actually used for simulation of responses. Assuming
that the data reported are correct, the values for the parameters
of $M^{(1)}$ used in simulation should rather be 0.01, 0.001,
0.01, and 0.001. To start with we use the data of Buzzi et al.
(1984) in our discrimination criterion, the Discrimination
Index, and find that $M^{(1)}$, with 0.0134 as the value of this
index, is closest to the true model. On the other hand, the
highest value 0.4040 of $D_9^{(3)}$ is indicative of the worst
performance of $M^{(3)}$. Besides, the values 0.3088 and 0.2738
of $D_9^{(2)}$ and $D_9^{(4)}$, respectively, show the closeness between
$M^{(2)}$ and $M^{(4)}$, although as compared to $M^{(1)}$ these prove to be
inferior models.


In order to make the distinction more clear we design
some additional points, using the assumption that the covariance
matrix of errors, under the four models, is the same, though unknown.
To that purpose the independent variables $\xi_1$, $\xi_2$ are
constrained to lie in the interval $\{5.0 \le \xi_1, \xi_2 \le 55.0\}$ and
the criterion $\phi(\xi_{9+1})$, utilizing $h_2$ of (4.3.28), is employed.
So far as the weights for the criterion function are concerned,
models $M^{(2)}$ and $M^{(4)}$, being closest at the 9th run, are given
the highest weightage, 0.361349, while $M^{(1)}$ and $M^{(3)}$, being the
farthest, receive the least, i.e., 0.013503. The criterion yields
(54.0, 28.0) as the new optimal point. For the purpose of simulating
the values of the responses $y_1$, $y_2$ we use model





stage that $M^{(4)}$ with $D^{(4)} = 0.0917$ is closest to $M^{(1)}$ and that
$M^{(3)}$ with $D^{(3)} = 0.6790$ may be considered to be a bad model.
The progress in discrimination in the sequential procedure can
be seen through Table 6.3.1. Besides, Figure 6.3.1 is presented
to see the trend of discrimination. Table 6.3.2 shows the weights
used at different stages and Figure 6.3.2 presents a picture
of the change in weights as the relative position
of the models changes from the first stage to the second.
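
The progress recorded in Table 6.3.1 can be read off mechanically: at each run the model with the smallest discrimination index is the one favoured and the model with the largest index is the one least supported. The sketch below applies this reading to the index values of runs 9 to 12; the helper name is illustrative.

```python
# Discrimination index of models 1..4 after runs 9 to 12 (Table 6.3.1).
index_by_run = {
    9: [0.0134, 0.3088, 0.4040, 0.2738],
    10: [0.0004, 0.3044, 0.4962, 0.1990],
    11: [0.0000, 0.2485, 0.6163, 0.1351],
    12: [0.0000, 0.2293, 0.6790, 0.0917],
}


def best_and_worst(values):
    """Return the 1-based model numbers with the smallest and largest index."""
    positions = range(len(values))
    best = min(positions, key=lambda i: values[i]) + 1
    worst = max(positions, key=lambda i: values[i]) + 1
    return best, worst


summary = {run: best_and_worst(values) for run, values in index_by_run.items()}
for run, (best, worst) in summary.items():
    print(f"run {run}: favoured model {best}, least supported model {worst}")
```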

Finally, we test the adequacy of the selected model, i.e. , 


M* I 


0.0099 

Yl (1 + 0.0991 + 0.000854 t 2 ) 

0*00098 SI 

V _ *g ■> ■* < mm * 

2 (1 + 0.991 + 0.000854 g £ ) 


Using the test proposed for this purpose we find that the value
of the statistic V (= 0.005) is much smaller than the
corresponding 5% point of the Chi-square distribution
($\chi^2_{3,0.05} = 7.815$), thereby establishing that $M^*$ is an
adequate model for the simulated chemical process.
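
The tabulated value 7.815 is the upper 5% point of the Chi-square distribution with 3 degrees of freedom, which can be verified with the closed form of that distribution function. The sketch below is only a check of the tabulated point under that assumption about the degrees of freedom; the function name is illustrative.

```python
import math


def chi2_cdf_3df(x):
    """Distribution function of Chi-square with 3 degrees of freedom.

    Closed form: erf(sqrt(x/2)) - sqrt(2x/pi) * exp(-x/2), for x >= 0.
    """
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


upper_tail_at_point = 1.0 - chi2_cdf_3df(7.815)   # should be close to 0.05
upper_tail_at_v = 1.0 - chi2_cdf_3df(0.005)       # tail area beyond V = 0.005

print(f"tail beyond 7.815: {upper_tail_at_point:.4f}")
print(f"tail beyond V    : {upper_tail_at_v:.4f}")
```

A tail area beyond V that is far above 0.05 corresponds to the statement that V is much smaller than the tabulated point.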



Table 6.3.3 
simulated by Buzzi et 
Since the 
assessed 
do not show 
values, 15*1* 


discrimination $ 
the same set of models, 
procedure, is 
set of data 



Table 6.3.1

Sequential discrimination among nonlinear
bivariate models; the present procedure
(True model: $M^{(1)}$)

Run   Independent      Responses            Discrimination index
 n     variables
      xi_1   xi_2     y_1     y_2     D^(1)    D^(2)    D^(3)    D^(4)
------------------------------------------------------------------------
  1    20     20      3.61    0.53
  2    30     20      5.42    0.44
  3    20     30      5.00    0.64
  4    30     30      7.50    0.66
  5    25     25      5.73    0.55
  6    25     15      3.80    0.33
  7    25     35      7.30    0.79
  8    15     25      4.90    0.35
  9    35     25      5.90    0.71    0.0134   0.3088   0.4040   0.2738
 10    54     28      9.66    0.61    0.0004   0.3044   0.4962   0.1990
 11    17     43      6.30    0.52    0.0000   0.2485   0.6163   0.1351
 12     8     55      8.89    0.34    0.0000   0.2293   0.6790   0.0917
------------------------------------------------------------------------


Table 6.3.2

Weights used in the criterion function $\phi$ at different
stages; discrimination among nonlinear bivariate models

Sequential    Model                  Model u
stage k         v          1           2           3
------------------------------------------------------
    1           2      0.017669
                3      0.013503    0.311402
                4      0.019924    0.361349    0.276152

    2           2      0.000719
                3      0.000441    0.367036
                4      0.001101    0.390869    0.239833

    3           2      0.000025
                3      0.000010    ...
                4      ...         ...         ...
------------------------------------------------------



FIGURE 6.3.1 Performance of the proposed procedure in
discriminating among nonlinear bivariate models (discrimination
index against sequential runs). True model: Model 1.



FIGURE 6.3.2 Change in weights, $w_{u,v;k}$, from first
to second stage (first stage weights and second stage weights);
nonlinear bivariate models.


Table 6.3.3

[Table entries are not legible in the source.]



(with known $\Sigma$), used by them for this purpose, indicate that all the models are adequate. The addition of the responses (9.15, 0.93), corresponding to the setting (55.0, 32.8) of $\xi_{10}$, renders M^(?) an inadequate model with $\chi^2 = 58.3$. According to their scheme this model is dropped for the purpose of designing the next experiment. In fact, M^(?) continues to be inadequate throughout the sequential procedure, as can be seen from Table 6.3.3. Similarly, M^(?), too, once dropped after 11 experiments, never got a chance to be included in the design criterion. Thus only M^(?) and M^(?) participated in designing new experiments. These models continued to be close rivals until the 19th experiment, when the procedure is stopped according to the prescribed condition. The model M^(?) is declared the correct model.

It can be clearly seen that this decision, which became possible only after 19 experiments through the procedure of Buzzi-Ferraris et al., has been reached after only 12 experiments through the procedure proposed in this work. In addition to faster convergence, the present method also has the advantage of handling situations in which $\Sigma$ is unknown, as has been assumed in the present case.



6.3.2 Example: Nonlinear Models; Unknown, Unequal Covariance Matrices of Errors

In this example we consider a hypothetical biresponse system for which the model $M^{(0)}$:

[Equation (6.3.2), defining the responses $\eta_1^{(0)}$ and $\eta_2^{(0)}$ of $M^{(0)}$, is illegible in the source.]      (6.3.2)

with parameter values 0.0005, 0.16, and 15 (in the order printed), is the true model. So far as simulation of responses from this system is concerned, we use the equations

$$y_1 = \eta_1^{(0)} + e_1, \qquad y_2 = \eta_2^{(0)} + e_2,$$

where the independent random errors $e_1$, $e_2$ are normally distributed with means 0.0, 0.0 and standard deviations $3.162 \times 10^{-6}$ and $1.0 \times 10^{-4}$, respectively. This set-up for generation of data is the one used by Hunter and Wichern (1966). The data generated through $M^{(0)}$ are to be used for discriminating among three bivariate models:


[The defining equations of the three rival models $M^{(1)}$, $M^{(2)}$, and $M^{(3)}$ are illegible in the source.]


Table 6.3.4

Discrimination index and posterior probability in
discriminating among nonlinear bivariate models
(True model: M^(3))

Run   Input variables    Responses             Discrimination criteria
 k    xi_1k    xi_2k     y_1k       y_2k       D_k^(1)     D_k^(2)     D_k^(3)
                                               (P_k^(1))   (P_k^(2))   (P_k^(3))

 1     1.0      2.0      0.00094    0.01861
 2     4.0      2.0      0.00098    0.00520
 3     1.0      3.0      0.00140    0.02820
 4     4.0      3.0      0.00148    0.00780    0.5578      0.4415      0.0007
                                               (0.0390)    (0.0000)    (0.9610)
 5     5.0      3.0      0.01480    0.00620    0.5786      0.4214      0.0000
                                               (0.0010)    (0.0000)    (0.9990)



Initially, four experiments are selected according to a $2^2$ factorial design and $M^{(0)}$ is used to simulate observations on the responses $y_1$, $y_2$, as shown in Table 6.3.4. These values, when utilized in (2.5.27) and subsequently in (2.4.1), yield 0.5578, 0.4415, and 0.0007 as the values of the discrimination index for $M^{(1)}$, $M^{(2)}$, and $M^{(3)}$, respectively. These values show enough evidence in favour of $M^{(3)}$. However, we design some more points. For that purpose the independent variables are constrained within their operability region, $(0 < \xi \le 5)$. Before we proceed to design the 5th experiment we must decide on the weights. In this connection we notice that the weights calculated through the strategy proposed here are appropriate, in that the pair consisting of models $M^{(1)}$ and $M^{(2)}$, which have been found to be very close to each other, is given the highest weight of 0.996562. Since the other pairs, i.e., $(M^{(1)}, M^{(3)})$ and $(M^{(2)}, M^{(3)})$, consist of almost equidistant models, there is not much difference in their respective weights, these being 0.001519 and 0.001919.


Since the covariance matrices in the present case have been assumed to be unknown, we use the expression of (4.4.16) in the criterion function. Maximization of the resulting function gives (5, 3) as the next setting of the experiment, which in turn produces 0.00148 and 0.00622 as the values of the two responses, through $M^{(0)}$. The values of the discrimination index which result from the sample of (4+1) observations show that while the gap between $M^{(1)}$ and $M^{(2)}$ is widened, model $M^{(3)}$, too, is further pulled apart from the remaining models, as can be appreciated through the values of $D^{(u)}$, $u = 1, 2, 3$, given in Table 6.3.4. Besides, the low value of $D^{(3)}$, $0.12289 \times 10^{-n}$ (exponent illegible in the source), establishes the superiority of $M^{(3)}$. At this stage a look at the weights for the 6th experiment, if required, clearly suggests that $M^{(3)}$ need not be further discriminated from $M^{(1)}$ and $M^{(2)}$, since the pairs $(M^{(1)}, M^{(3)})$ and $(M^{(2)}, M^{(3)})$ require weights as low as 0.000003 and 0.000004, respectively. This fact, together with the low order of $D^{(3)}$, suggests the termination of the sequential procedure. Thus with the help of a single additional experiment $M^{(3)}$ is fairly well identified as being the best model.


Hunter and Wichern (1966) have used the Box-Hill (1967) procedure in simulating discrimination among the three models considered in this example. Based on the same preliminary design and using the identical hypothetical system, the posterior probabilities of the three models at two sequential stages are listed in Table 6.3.4 as bracketed values. In fact, after 6 experiments no further discrimination is required, as the probability of one of the models, namely $M^{(3)}$, has risen to 0.999, which carries enough evidence in favour of this model. In this example the convergence through the Box-Hill procedure, too, happens to be quite fast. However, through our procedure we could still save one run of the experiment.


APPENDIX A

COMBINING TWO QUADRATIC FORMS

Lemma A.1. If $Q_1$ and $Q_2$ are two quadratic forms in $x$,

$$Q_i = (x - v_i)' M_i^{-1} (x - v_i), \quad i = 1, 2,$$

with $x$, $v_i$ as $r \times 1$ vectors and $M_i^{-1}$ as the inverse of the $r \times r$ symmetric matrix $M_i$, $i = 1, 2$, then

$$Q_1 + Q_2 = (x - w)' M^{*-1} (x - w) + (v_1 - v_2)' (M_1 + M_2)^{-1} (v_1 - v_2), \tag{A.1.1}$$

where

$$w = (M_1 + M_2)^{-1} (M_2 v_1 + M_1 v_2), \tag{A.1.2}$$

$$M^{*-1} = M_1^{-1} (M_1 + M_2) M_2^{-1}. \tag{A.1.3}$$

Proof:

Let $M_i^{-1} = A_i$, $i = 1, 2$, and $A = A_1 + A_2$. Let $(A_1 + A_2)^{-1}$ $(= A^{-1})$ denote the inverse of $A_1 + A_2$ $(= A)$. Then $Q_1 + Q_2$ can be written as

$$Q_1 + Q_2 = (x - v_1)' A_1 (x - v_1) + (x - v_2)' A_2 (x - v_2). \tag{A.1.4}$$

Simplifying and rearranging the terms on the right hand side of (A.1.4), we get

$$\begin{aligned}
Q_1 + Q_2 &= x' A x - 2 x' (A_1 v_1 + A_2 v_2) + v_1' A_1 v_1 + v_2' A_2 v_2 \\
&= x' A x - 2 x' A w + w' A w + v_1' A_1 v_1 + v_2' A_2 v_2 - w' A w \\
&= (x - w)' A (x - w) + v_1' A_1 v_1 + v_2' A_2 v_2 - w' A w,
\end{aligned} \tag{A.1.5}$$

where $w = A^{-1} (A_1 v_1 + A_2 v_2)$.

But

$$\begin{aligned}
w' A w &= (A_1 v_1 + A_2 v_2)' A^{-1} A A^{-1} (A_1 v_1 + A_2 v_2) \\
&= [A_1 (v_1 - v_2) + A v_2]' A^{-1} [A v_1 - A_2 (v_1 - v_2)] \\
&= (v_1 - v_2)' A_1 A^{-1} A v_1 - (v_1 - v_2)' A_1 A^{-1} A_2 (v_1 - v_2) + v_2' A A^{-1} A v_1 - v_2' A A^{-1} A_2 (v_1 - v_2) \\
&= (v_1 - v_2)' A_1 v_1 - (v_1 - v_2)' A_1 A^{-1} A_2 (v_1 - v_2) + v_2' A v_1 - v_2' A_2 (v_1 - v_2).
\end{aligned}$$

Thus

$$w' A w = v_1' A_1 v_1 + v_2' A_2 v_2 - (v_1 - v_2)' A_1 A^{-1} A_2 (v_1 - v_2). \tag{A.1.6}$$

Substituting (A.1.6) in (A.1.5) we have

$$Q_1 + Q_2 = (x - w)' A (x - w) + (v_1 - v_2)' A_1 A^{-1} A_2 (v_1 - v_2). \tag{A.1.7}$$

When we reconsider the substitution $A_i = M_i^{-1}$, $i = 1, 2$, we also use

$$A = (M_1^{-1} + M_2^{-1}) \quad \text{and} \quad A^{-1} = (M_1^{-1} + M_2^{-1})^{-1}.$$

Besides, the symmetry of $M_1$, $M_2$ and consequently of $M_1^{-1}$ and $M_2^{-1}$ enables us to use the following relations:

$$M_1 (M_1 + M_2)^{-1} M_2 = (M_1^{-1} + M_2^{-1})^{-1}$$

and

$$(M_1^{-1} + M_2^{-1}) = M_1^{-1} (M_1 + M_2) M_2^{-1}.$$

As a result we have

$$A = M_1^{-1} (M_1 + M_2) M_2^{-1}, \qquad A^{-1} = M_1 (M_1 + M_2)^{-1} M_2, \tag{A.1.8}$$

and from (A.1.7) we get

$$Q_1 + Q_2 = (x - w)' M^{*-1} (x - w) + (v_1 - v_2)' (M_1 + M_2)^{-1} (v_1 - v_2), \tag{A.1.9}$$

where

$$\begin{aligned}
w &= (A_1 + A_2)^{-1} (A_1 v_1 + A_2 v_2) \\
&= (M_1^{-1} + M_2^{-1})^{-1} (M_1^{-1} v_1 + M_2^{-1} v_2) \\
&= M_1 (M_1 + M_2)^{-1} M_2 (M_1^{-1} v_1 + M_2^{-1} v_2),
\end{aligned}$$

i.e.,

$$w = (M_1 + M_2)^{-1} (M_2 v_1 + M_1 v_2). \tag{A.1.10}$$

This proves the Lemma.


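Corollary A.1. We now replace the vectors $x$, $v_1$, $v_2$ and the matrices $M_1$, $M_2$ by scalars $x$, $v_1$, $v_2$ and $m_1$, $m_2$, respectively, so that $M_i^{-1}$ will be replaced by $1/m_i$ $(m_i \ne 0)$ and $Q_i = (x - v_i)^2 / m_i$, $i = 1, 2$. As a result we have

$$Q_1 + Q_2 = \frac{(x - w)^2}{m^*} + \frac{(v_1 - v_2)^2}{m_1 + m_2}, \tag{A.1.11}$$

where

$$m^* = \frac{m_1 m_2}{m_1 + m_2} \quad \text{and} \quad w = \frac{m_2 v_1 + m_1 v_2}{m_1 + m_2}. \tag{A.1.12}$$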
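The identity of Lemma A.1 can be checked on a small case. The following is only a sketch under stated assumptions: the 2 x 2 matrices and vectors are hypothetical illustration values, the helper names are not from the thesis, and $w$ is computed in the form used in the proof, $w = A^{-1}(A_1 v_1 + A_2 v_2)$ with $A = M_1^{-1} + M_2^{-1}$. The compact form $(M_1 + M_2)^{-1}(M_2 v_1 + M_1 v_2)$ agrees with it whenever $M_1$ and $M_2$ commute, as they always do in the scalar case of Corollary A.1.

```python
from fractions import Fraction as F

def inv2(m):
    """Inverse of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def add(m, n):
    """Entrywise sum of two matrices."""
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(m, n))

def matvec(m, v):
    """Matrix-vector product."""
    return tuple(sum(p * q for p, q in zip(row, v)) for row in m)

def quad(v, m):
    """Quadratic form v' m v."""
    return sum(p * q for p, q in zip(v, matvec(m, v)))

def sub(u, v):
    """Difference of two vectors."""
    return tuple(p - q for p, q in zip(u, v))

def both_sides(x, v1, v2, M1, M2):
    """Return (Q1 + Q2, right-hand side of (A.1.1))."""
    A1, A2 = inv2(M1), inv2(M2)
    A = add(A1, A2)                      # A = M1^{-1} + M2^{-1}
    s = tuple(p + q for p, q in zip(matvec(A1, v1), matvec(A2, v2)))
    w = matvec(inv2(A), s)               # w = A^{-1}(A1 v1 + A2 v2)
    lhs = quad(sub(x, v1), A1) + quad(sub(x, v2), A2)
    rhs = quad(sub(x, w), A) + quad(sub(v1, v2), inv2(add(M1, M2)))
    return lhs, rhs

# Hypothetical inputs; exact rational arithmetic avoids round-off.
M1 = ((F(2), F(1)), (F(1), F(3)))
M2 = ((F(4), F(-1)), (F(-1), F(2)))
x, v1, v2 = (F(1), F(2)), (F(0), F(1)), (F(3), F(-1))
lhs, rhs = both_sides(x, v1, v2, M1, M2)
print(lhs == rhs)
```

Rational arithmetic is used so that the two sides can be compared for exact equality rather than within a tolerance.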



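APPENDIX B

SOME RESULTS OF MATRIX ALGEBRA

Lemma B.1. Let $A_1$ be an $r \times r$ positive definite symmetric matrix and $A_2$ be an $r \times r$ non-negative definite symmetric matrix. Then for sufficiently small $\alpha$

$$\log |I - \alpha A_1 A_2| = -\sum_{b=1}^{\infty} \frac{\alpha^b}{b} \, \mathrm{tr}[(A_1 A_2)^b]. \tag{B.1.1}$$

Proof:

Let $A_1 A_2 = M$.

Case i: Suppose $A_1 A_2 = A_2 A_1$. Then, $A_1$ and $A_2$ being symmetric, $M$ is also symmetric and we can write

$$(I - \alpha M) = T [\mathrm{diag}\{(1 - \alpha\gamma_1), (1 - \alpha\gamma_2), \ldots, (1 - \alpha\gamma_r)\}] T',$$

where $\gamma_1, \gamma_2, \ldots, \gamma_r$ are the eigenvalues of $M$, $T$ is some suitable transformation such that $T'T = I$, and $I$ is an $r \times r$ identity matrix.

Therefore

$$|I - \alpha M| = \prod_{i=1}^{r} (1 - \alpha\gamma_i)$$

and

$$\log |I - \alpha M| = \sum_{i=1}^{r} \log (1 - \alpha\gamma_i).$$

Since $\alpha$ is sufficiently small, we can expand $\log(1 - \alpha\gamma_i)$ and obtain

$$\log |I - \alpha M| = -\sum_{i=1}^{r} \Big[ \sum_{b=1}^{\infty} \frac{\alpha^b}{b} \gamma_i^b \Big] = -\sum_{b=1}^{\infty} \frac{\alpha^b}{b} \, \mathrm{tr}(M^b),$$

i.e.,

$$\log |I - \alpha A_1 A_2| = -\sum_{b=1}^{\infty} \frac{\alpha^b}{b} \, \mathrm{tr}[(A_1 A_2)^b].$$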

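Case ii: If $M (= A_1 A_2)$ has distinct eigenvalues, then we can write

$$(I - \alpha M) = T [\mathrm{diag}\{(1 - \alpha\gamma_1), (1 - \alpha\gamma_2), \ldots, (1 - \alpha\gamma_r)\}] T^{-1},$$

where the matrix $T$ can be formed by taking the eigenvectors of $(I - \alpha M)$ as the columns of $T$.

Case iii: If the eigenvalues of $M$ are not distinct then, using the Jordan canonical form, we have [Bellman (1977)]

$$I - \alpha M = T [\mathrm{diag}\{L_{m_1}(\gamma_1), L_{m_2}(\gamma_2), \ldots, L_{m_s}(\gamma_s)\}] T^{-1},$$

where $\gamma_j$ is an eigenvalue of multiplicity $m_j$, $j = 1, 2, \ldots, s$, with $r = \sum_{j=1}^{s} m_j$, and $L_{m_j}(\gamma_j)$ is an $m_j \times m_j$ matrix, given by

$$L_{m_j}(\gamma_j) = \begin{pmatrix}
(1 - \alpha\gamma_j) & 1 & 0 & \cdots & 0 \\
0 & (1 - \alpha\gamma_j) & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & (1 - \alpha\gamma_j) & 1 \\
0 & 0 & \cdots & 0 & (1 - \alpha\gamma_j)
\end{pmatrix}.$$

Therefore,

$$|I - \alpha M| = \prod_{j=1}^{s} (1 - \alpha\gamma_j)^{m_j}.$$

Taking logarithm on both sides, we get

$$\log |I - \alpha M| = \sum_{j=1}^{s} m_j \log (1 - \alpha\gamma_j).$$

Since $\alpha$ is sufficiently small we can expand $\log (1 - \alpha\gamma_j)$ and obtain

$$\log |I - \alpha M| = -\sum_{j=1}^{s} m_j \sum_{b=1}^{\infty} \frac{\alpha^b}{b} \gamma_j^b = -\sum_{b=1}^{\infty} \frac{\alpha^b}{b} \sum_{j=1}^{s} m_j \gamma_j^b,$$

i.e.,

$$\log |I - \alpha M| = -\sum_{b=1}^{\infty} \frac{\alpha^b}{b} \, \mathrm{tr}(M^b).$$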

Corollary B.1. If $a_1$ and $a_2$ are two numbers such that $a_1 > 0$, $a_2 \ge 0$, then from (B.1.1) we get

$$\log (1 - \alpha a_1 a_2) = -\sum_{b=1}^{\infty} \frac{\alpha^b}{b} (a_1 a_2)^b, \tag{B.1.2}$$

for sufficiently small $\alpha > 0$.
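A numerical illustration of (B.1.1) may help fix ideas. The sketch below compares a truncated trace series with the directly computed log-determinant for a 2 x 2 case; the matrices, the value of $\alpha$, the truncation length, and the function names are all assumptions made for illustration only.

```python
import math

def matmul(m, n):
    """Product of two square matrices given as nested lists."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def log_det_series(M, alpha, terms=80):
    """Right-hand side of (B.1.1), truncated after `terms` terms."""
    total, power = 0.0, None
    for b in range(1, terms + 1):
        power = M if power is None else matmul(power, M)   # power = M^b
        total -= alpha ** b / b * trace(power)
    return total

def log_det_direct_2x2(M, alpha):
    """log|I - alpha*M| for a 2x2 matrix, from the explicit determinant."""
    a, b = 1 - alpha * M[0][0], -alpha * M[0][1]
    c, d = -alpha * M[1][0], 1 - alpha * M[1][1]
    return math.log(a * d - b * c)

A1 = [[2.0, 1.0], [1.0, 3.0]]   # positive definite, symmetric (hypothetical)
A2 = [[1.0, 0.5], [0.5, 1.0]]   # non-negative definite, symmetric (hypothetical)
M = matmul(A1, A2)
alpha = 0.1                     # alpha times the largest eigenvalue of M is below 1
series, direct = log_det_series(M, alpha), log_det_direct_2x2(M, alpha)
print(abs(series - direct) < 1e-10)
```

Here $A_1 A_2 \ne A_2 A_1$, so the example falls outside Case i of the proof; the series still matches because only the eigenvalues of $M$ enter.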

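Lemma B.2. If $B_1$ and $B_2$ are two matrices such that $B_1 B_2 = B_2 B_1$, then

$$(B_1 + B_2)^a = \sum_{b=0}^{a} \binom{a}{b} B_1^b B_2^{a-b}. \tag{B.2.1}$$

Proof: By induction.

The result is obvious for $a = 1$. When $a = 2$,

$$(B_1 + B_2)^2 = B_1^2 + 2 B_1 B_2 + B_2^2.$$

Let the identity (B.2.1) be true for $a = k$. Then

$$\begin{aligned}
(B_1 + B_2)^k &= \sum_{b=0}^{k} \binom{k}{b} B_1^b B_2^{k-b} \\
&= B_1^k + \binom{k}{k-1} B_1^{k-1} B_2 + \cdots + \binom{k}{j} B_1^{j} B_2^{k-j} + \binom{k}{j-1} B_1^{j-1} B_2^{k-j+1} + \cdots + B_2^k.
\end{aligned}$$

Multiplying both sides by $(B_1 + B_2)$ we get

$$\begin{aligned}
(B_1 + B_2)^{k+1} ={}& B_1^{k+1} + \binom{k}{k-1} B_1^{k} B_2 + \cdots + \binom{k}{j} B_1^{j+1} B_2^{k-j} + \binom{k}{j-1} B_1^{j} B_2^{k-j+1} + \cdots + B_1 B_2^k \\
&+ B_1^k B_2 + \binom{k}{k-1} B_1^{k-1} B_2^2 + \cdots + \binom{k}{j} B_1^{j} B_2^{k-j+1} + \binom{k}{j-1} B_1^{j-1} B_2^{k-j+2} + \cdots + B_2^{k+1}.
\end{aligned}$$

Collecting terms of like powers, we get

$$(B_1 + B_2)^{k+1} = B_1^{k+1} + \Big[\binom{k}{k-1} + \binom{k}{k}\Big] B_1^k B_2 + \cdots + \Big[\binom{k}{j-1} + \binom{k}{j}\Big] B_1^{j} B_2^{k-j+1} + \cdots + \Big[\binom{k}{0} + \binom{k}{1}\Big] B_1 B_2^k + B_2^{k+1}.$$

This shows that the coefficient of the general term (i.e. the jth term) in this expansion is

$$\binom{k}{j-1} + \binom{k}{j} = \binom{k+1}{j},$$

showing that the result is true for $a = k+1$.

Hence, by mathematical induction,

$$(B_1 + B_2)^a = \sum_{b=0}^{a} \binom{a}{b} B_1^b B_2^{a-b}.$$

Corollary B.2. If the matrices $B_1$ and $B_2$ are replaced by the scalars $b_1$ and $b_2$, respectively, we have for the positive integer $a$

$$(b_1 + b_2)^a = \sum_{i=0}^{a} \binom{a}{i} b_1^i b_2^{a-i}. \tag{B.2.2}$$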
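The role of the commutativity condition in Lemma B.2 can be seen on small integer matrices. This is an illustrative sketch only: the matrices are hypothetical, and $B_2$ is built as a polynomial in $B_1$ simply to guarantee $B_1 B_2 = B_2 B_1$.

```python
from math import comb

def matmul(m, n):
    """Product of two square matrices given as nested lists."""
    size = len(m)
    return [[sum(m[i][k] * n[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def matadd(m, n):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(m, n)]

def matpow(m, p):
    """m raised to the non-negative integer power p."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(p):
        result = matmul(result, m)
    return result

def binomial_sum(B1, B2, a):
    """Right-hand side of (B.2.1): sum over b of C(a, b) B1^b B2^(a-b)."""
    size = len(B1)
    total = [[0] * size for _ in range(size)]
    for b in range(a + 1):
        term = matmul(matpow(B1, b), matpow(B2, a - b))
        for i in range(size):
            for j in range(size):
                total[i][j] += comb(a, b) * term[i][j]
    return total

B1 = [[1, 2], [3, 4]]
B2 = matadd(matpow(B1, 2), [[2, 0], [0, 2]])    # B1^2 + 2I commutes with B1
commuting_ok = matpow(matadd(B1, B2), 5) == binomial_sum(B1, B2, 5)

C1 = [[0, 1], [0, 0]]
C2 = [[0, 0], [1, 0]]                            # C1 C2 != C2 C1
noncommuting_ok = matpow(matadd(C1, C2), 2) == binomial_sum(C1, C2, 2)
print(commuting_ok, noncommuting_ok)
```

The second pair shows that the expansion generally fails once $B_1 B_2 \ne B_2 B_1$, which is why the lemma states the condition explicitly.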

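Lemma B.3. If $B_1$ and $B_2$ are two matrices such that $B_1 B_2 = B_2 B_1$, then for sufficiently small $\alpha_1$, $\alpha_2$

$$(I - \alpha_1 B_1 - \alpha_2 B_2)^{-1} = \sum_{a=0}^{\infty} \sum_{b=0}^{\infty} \binom{a+b}{a} \alpha_1^a \alpha_2^b B_1^a B_2^b. \tag{B.3.1}$$

Proof: Let $\alpha_1 B_1 + \alpha_2 B_2 = A$.

Then

$$(I - A)^{-1} = I + A + A^2 + \cdots + A^j + \cdots = \sum_{j=0}^{\infty} A^j.$$

But $A^j = (\alpha_1 B_1 + \alpha_2 B_2)^j$ can be expanded using Lemma B.2 so that

$$A^j = \sum_{k=0}^{j} \binom{j}{k} \alpha_1^k \alpha_2^{j-k} B_1^k B_2^{j-k}.$$

Consequently, we can write $(I - A)^{-1}$ as

$$(I - A)^{-1} = \sum_{j=0}^{\infty} \sum_{k=0}^{j} \binom{j}{k} \alpha_1^k \alpha_2^{j-k} B_1^k B_2^{j-k},$$

and if we let $k = a$ and $j - k = b$, then

$$(I - \alpha_1 B_1 - \alpha_2 B_2)^{-1} = \sum_{a=0}^{\infty} \sum_{b=0}^{\infty} \binom{a+b}{a} \alpha_1^a \alpha_2^b B_1^a B_2^b.$$

This proves the identity (B.3.1).

Corollary B.3. If $b_1$ and $b_2$ are any two scalars, then for sufficiently small $\alpha_1$, $\alpha_2$

$$(1 - \alpha_1 b_1 - \alpha_2 b_2)^{-1} = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \binom{i+j}{i} (\alpha_1 b_1)^i (\alpha_2 b_2)^j, \tag{B.3.2}$$

as can be obtained from (B.3.1) by replacing the matrices $I$, $B_1$, and $B_2$ by 1, $b_1$, and $b_2$, respectively.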
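The scalar form (B.3.2) lends itself to a quick numerical check. In the sketch below the values of $\alpha_1$, $b_1$, $\alpha_2$, $b_2$ and the truncation order are assumptions chosen so that $|\alpha_1 b_1| + |\alpha_2 b_2| < 1$, which is one concrete reading of "sufficiently small".

```python
from math import comb

def double_series(p, q, max_order=80):
    """Truncation of sum_{i,j} C(i+j, i) p^i q^j over i + j <= max_order."""
    return sum(comb(i + j, i) * p ** i * q ** j
               for i in range(max_order + 1)
               for j in range(max_order + 1 - i))

alpha1, b1, alpha2, b2 = 0.1, 2.0, 0.05, 6.0   # hypothetical values
p, q = alpha1 * b1, alpha2 * b2                # about 0.2 and 0.3
closed_form = 1.0 / (1.0 - p - q)
print(abs(double_series(p, q) - closed_form) < 1e-12)
```

Grouping the double sum by total order $i + j = n$ recovers the geometric series in $(p + q)^n$, which is the regrouping used in the proof of Lemma B.3.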



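APPENDIX C

JOINT CUMULANT GENERATING FUNCTION

C.1 Univariate Case

Lemma C.1. Let $Y$ be a normal random variable with mean $\mu$ and variance $\lambda^{-1}$. Then the joint cumulant generating function for $Q_1$ and $Q_2$, $Q_i = \lambda_i (Y - \mu_i)^2$, $i = 1, 2$, is given by

$$\begin{aligned}
k(w_1, w_2) ={}& \sum_{a=1}^{\infty} \Big[ 2^{a-1} (a-1)! \, (1 + a\lambda d_1^2) \Big(\frac{\lambda_1}{\lambda}\Big)^a \Big] \frac{w_1^a}{a!} \\
&+ \sum_{b=1}^{\infty} \Big[ 2^{b-1} (b-1)! \, (1 + b\lambda d_2^2) \Big(\frac{\lambda_2}{\lambda}\Big)^b \Big] \frac{w_2^b}{b!} \\
&+ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} \Big[ 2^{a+b-1} (a+b-2)! \, \{ (a+b-1) + a(a-1)\lambda d_1^2 + b(b-1)\lambda d_2^2 + 2ab\lambda d_1 d_2 \} \Big(\frac{\lambda_1}{\lambda}\Big)^a \Big(\frac{\lambda_2}{\lambda}\Big)^b \Big] \frac{w_1^a w_2^b}{a! \, b!},
\end{aligned} \tag{C.1.1}$$

for some real $w_1$, $w_2$, where $d_i = \mu - \mu_i$, $i = 1, 2$; $\lambda = \lambda_1 + \lambda_2$.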


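Proof: Since $Y$ is distributed as $N(\mu, \lambda^{-1})$, the joint cumulant generating function for $Q_1$ and $Q_2$, by definition, is given by

$$k(w_1, w_2) = \log \Big[ (\lambda / 2\pi)^{1/2} \int_{-\infty}^{\infty} \exp (w_1 Q_1 + w_2 Q_2) \exp (-\tfrac{1}{2} Q) \, dy \Big], \tag{C.1.2}$$

where $Q = \lambda (y - \mu)^2$.

Now,

$$w_1 Q_1 + w_2 Q_2 - \tfrac{1}{2} Q = \frac{\lambda}{2} \{ 2 c_1 (y - \mu_1)^2 + 2 c_2 (y - \mu_2)^2 - (y - \mu)^2 \},$$

where $c_i = w_i \lambda_i / \lambda$, $i = 1, 2$.

This can also be written as

$$w_1 Q_1 + w_2 Q_2 - \tfrac{1}{2} Q = -\tfrac{1}{2} \lambda \delta z^2 + 2\lambda (c_1 d_1 + c_2 d_2) z + \lambda (c_1 d_1^2 + c_2 d_2^2), \tag{C.1.3}$$

where $\delta = (1 - 2c_1 - 2c_2)$, $z = (y - \mu)$, and $d_i = (\mu - \mu_i)$, $i = 1, 2$.

Substituting (C.1.3) in (C.1.2) we get

$$k(w_1, w_2) = -\tfrac{1}{2} \log \delta + \log \Big[ (\lambda\delta / 2\pi)^{1/2} \int_{-\infty}^{\infty} \exp \{ 2\lambda (c_1 d_1 + c_2 d_2) z \} \exp (-\tfrac{1}{2} \lambda \delta z^2) \, dz \Big] + \lambda (c_1 d_1^2 + c_2 d_2^2),$$

i.e.,

$$k(w_1, w_2) = -\tfrac{1}{2} \log \delta + 2\lambda (c_1 d_1 + c_2 d_2)^2 \delta^{-1} + \lambda (c_1 d_1^2 + c_2 d_2^2). \tag{C.1.4}$$

Now,

$$\begin{aligned}
c_1 d_1^2 + c_2 d_2^2 &= (c_1 d_1^2 + c_2 d_2^2)(1 - 2c_1 - 2c_2) \delta^{-1} \\
&= \{ c_1 d_1^2 + c_2 d_2^2 - 2 c_1^2 d_1^2 - 2 c_2^2 d_2^2 - 2 c_1 c_2 (d_1^2 + d_2^2) \} \delta^{-1},
\end{aligned}$$

so that from (C.1.4) we obtain

$$k(w_1, w_2) = -\tfrac{1}{2} \log \delta + \lambda c_1 d_1^2 \delta^{-1} + \lambda c_2 d_2^2 \delta^{-1} - 2\lambda c_1 c_2 (d_1 - d_2)^2 \delta^{-1}. \tag{C.1.5}$$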



Using equations (B.1.2) and (B.2.2) from Lemmas B.1 and B.2, we can express the first component of (C.1.4) as

$$\begin{aligned}
-\tfrac{1}{2} \log \delta &= \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} (c_1 + c_2)^a = \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} \sum_{b=0}^{a} \binom{a}{b} c_1^b c_2^{a-b} \\
&= \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} c_1^a + \sum_{b=1}^{\infty} \frac{2^{b-1}}{b} c_2^b + \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-1)!}{a! \, b!} c_1^a c_2^b.
\end{aligned} \tag{C.1.6}$$

Also, we have, from relation (B.3.2) of Lemma B.3,

$$\delta^{-1} = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \binom{i+j}{i} 2^{i+j} c_1^i c_2^j,$$

through which we can express the other components of (C.1.5) in the form of power series, e.g.,

$$\lambda c_1 d_1^2 \delta^{-1} = \lambda d_1^2 \Big[ \sum_{a=1}^{\infty} 2^{a-1} c_1^a + \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-1)!}{(a-1)! \, b!} c_1^a c_2^b \Big]. \tag{C.1.7}$$

Similarly,

$$\lambda c_2 d_2^2 \delta^{-1} = \lambda d_2^2 \Big[ \sum_{b=1}^{\infty} 2^{b-1} c_2^b + \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-1)!}{a! \, (b-1)!} c_1^a c_2^b \Big]. \tag{C.1.8}$$

Besides,

$$\lambda c_1 c_2 (d_1 - d_2)^2 \delta^{-1} = \lambda (d_1^2 + d_2^2 - 2 d_1 d_2) \times \Big[ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-2} \frac{(a+b-2)!}{(a-1)! \, (b-1)!} c_1^a c_2^b \Big]. \tag{C.1.9}$$

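Substituting (C.1.6) through (C.1.9) in (C.1.5) and rearranging the terms appropriately, we get

$$\begin{aligned}
k(w_1, w_2) ={}& \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} (1 + a\lambda d_1^2) c_1^a + \sum_{b=1}^{\infty} \frac{2^{b-1}}{b} (1 + b\lambda d_2^2) c_2^b \\
&+ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-2)!}{a! \, b!} c_1^a c_2^b \{ (a+b-1) + a(a+b-1)\lambda d_1^2 + b(a+b-1)\lambda d_2^2 - ab\lambda (d_1 - d_2)^2 \}.
\end{aligned}$$

Simplifying further and substituting back $c_i = (w_i \lambda_i)/\lambda$, $i = 1, 2$, we finally have

$$\begin{aligned}
k(w_1, w_2) ={}& \sum_{a=1}^{\infty} \Big[ 2^{a-1} (a-1)! \, (1 + a\lambda d_1^2) \Big(\frac{\lambda_1}{\lambda}\Big)^a \Big] \frac{w_1^a}{a!} \\
&+ \sum_{b=1}^{\infty} \Big[ 2^{b-1} (b-1)! \, (1 + b\lambda d_2^2) \Big(\frac{\lambda_2}{\lambda}\Big)^b \Big] \frac{w_2^b}{b!} \\
&+ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} \Big[ 2^{a+b-1} (a+b-2)! \, \{ (a+b-1) + a(a-1)\lambda d_1^2 + b(b-1)\lambda d_2^2 + 2ab\lambda d_1 d_2 \} \Big(\frac{\lambda_1}{\lambda}\Big)^a \Big(\frac{\lambda_2}{\lambda}\Big)^b \Big] \frac{w_1^a w_2^b}{a! \, b!}.
\end{aligned} \tag{C.1.10}$$

The function $k(w_1, w_2)$ given by (C.1.10) can be used to generate any required cumulant: the cumulant $k_{a0}$ is the coefficient of $(w_1^a / a!)$, the cumulant $k_{0b}$ is the coefficient of $(w_2^b / b!)$, and the cumulant $k_{ab}$ is the coefficient of $(w_1^a w_2^b / a! \, b!)$ in the above expression. Thus

$$k_{a0} = 2^{a-1} (a-1)! \, (1 + a\lambda d_1^2) \Big(\frac{\lambda_1}{\lambda}\Big)^a, \tag{C.1.11}$$

$$k_{0b} = 2^{b-1} (b-1)! \, (1 + b\lambda d_2^2) \Big(\frac{\lambda_2}{\lambda}\Big)^b, \tag{C.1.12}$$

$$k_{ab} = 2^{a+b-1} (a+b-2)! \, \{ (a+b-1) + a(a-1)\lambda d_1^2 + b(b-1)\lambda d_2^2 + 2ab\lambda d_1 d_2 \} \Big(\frac{\lambda_1}{\lambda}\Big)^a \Big(\frac{\lambda_2}{\lambda}\Big)^b, \tag{C.1.13}$$

where $d_i = (\mu - \mu_i)$, $i = 1, 2$.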
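The low-order cumulants in (C.1.11) to (C.1.13) can be cross-checked against moments computed directly from the normal distribution. The sketch below is an illustration under assumptions: the parameter values are hypothetical, the function names are not from the thesis, and $Q_i$ is written as a polynomial in $Z = Y - \mu \sim N(0, \lambda^{-1})$ so that expectations follow from the even moments $E[Z^{2m}] = (2m-1)!! / \lambda^m$.

```python
from fractions import Fraction as F
from math import factorial

def k_ab(a, b, lam1, lam2, lam, d1, d2):
    """Joint cumulant from (C.1.11)-(C.1.13); requires a + b >= 1."""
    if b == 0:
        return 2 ** (a - 1) * factorial(a - 1) * (1 + a * lam * d1 ** 2) * (lam1 / lam) ** a
    if a == 0:
        return 2 ** (b - 1) * factorial(b - 1) * (1 + b * lam * d2 ** 2) * (lam2 / lam) ** b
    brace = ((a + b - 1) + a * (a - 1) * lam * d1 ** 2
             + b * (b - 1) * lam * d2 ** 2 + 2 * a * b * lam * d1 * d2)
    return (2 ** (a + b - 1) * factorial(a + b - 2) * brace
            * (lam1 / lam) ** a * (lam2 / lam) ** b)

def poly_mul(p, q):
    """Product of two polynomials (coefficient lists, lowest degree first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def gauss_expect(p, lam):
    """E[p(Z)] for Z ~ N(0, 1/lam); odd powers contribute zero."""
    total = F(0)
    for n, coeff in enumerate(p):
        if n % 2 == 0:
            double_fact = 1
            for odd in range(1, n, 2):
                double_fact *= odd           # (n - 1)!!
            total += coeff * double_fact / lam ** (n // 2)
    return total

# Hypothetical parameter values.
lam1, lam2, d1, d2 = F(2), F(3), F(1, 2), F(-1, 3)
lam = lam1 + lam2
q1 = [lam1 * d1 ** 2, 2 * lam1 * d1, lam1]   # Q1 = lam1 (Z + d1)^2
q2 = [lam2 * d2 ** 2, 2 * lam2 * d2, lam2]   # Q2 = lam2 (Z + d2)^2
m1, m2 = gauss_expect(q1, lam), gauss_expect(q2, lam)
var1 = gauss_expect(poly_mul(q1, q1), lam) - m1 ** 2
cov12 = gauss_expect(poly_mul(q1, q2), lam) - m1 * m2
c1 = [q1[0] - m1, q1[1], q1[2]]              # Q1 - E[Q1]
c2 = [q2[0] - m2, q2[1], q2[2]]              # Q2 - E[Q2]
third = gauss_expect(poly_mul(poly_mul(c1, c1), c2), lam)   # equals k_21

print(k_ab(1, 0, lam1, lam2, lam, d1, d2) == m1,
      k_ab(2, 0, lam1, lam2, lam, d1, d2) == var1,
      k_ab(1, 1, lam1, lam2, lam, d1, d2) == cov12,
      k_ab(2, 1, lam1, lam2, lam, d1, d2) == third)
```

Only orders up to three are compared, since up to that order joint cumulants coincide with central moments; higher orders would need the inversion formulae of D.5.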


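C.2 Multivariate Case

Lemma C.2. Let $Y$ be a random vector having an $r$-variate normal distribution $N_r(\mu, \Lambda^{-1})$ and let $Q_1$, $Q_2$ be two quadratic forms in $Y$ such that $Q_i = (Y - \mu_i)' \Lambda_i (Y - \mu_i)$, $i = 1, 2$, with $\Lambda_1$, $\Lambda_2$ as symmetric matrices and $\Lambda = \Lambda_1 + \Lambda_2$. Then the joint cumulant generating function of $Q_1$ and $Q_2$ is

$$\begin{aligned}
k(w_1, w_2) ={}& \sum_{a=1}^{\infty} 2^{a-1} (a-1)! \, \{ \mathrm{tr}(Z_1^a) + a \, d_1' \Lambda Z_1^a d_1 \} \frac{w_1^a}{a!} \\
&+ \sum_{b=1}^{\infty} 2^{b-1} (b-1)! \, \{ \mathrm{tr}(Z_2^b) + b \, d_2' \Lambda Z_2^b d_2 \} \frac{w_2^b}{b!} \\
&+ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} (a+b-2)! \, \{ (a+b-1) \, \mathrm{tr}(Z_1^a Z_2^b) + (a d_1 + b d_2)' \Lambda Z_1^a Z_2^b (a d_1 + b d_2) \\
&\qquad - a \, d_1' \Lambda Z_1^a Z_2^b d_1 - b \, d_2' \Lambda Z_1^a Z_2^b d_2 \} \frac{w_1^a w_2^b}{a! \, b!},
\end{aligned} \tag{C.2.1}$$

where $d_i = (\mu - \mu_i)$, $Z_i = \Lambda^{-1} \Lambda_i$, $i = 1, 2$, and $w_1$, $w_2$ are some real variables.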

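Proof: Since $Y$ is distributed as $N_r(\mu, \Lambda^{-1})$, the joint cumulant generating function of $Q_1$ and $Q_2$ is given by

$$k(w_1, w_2) = \log \Big[ \{ |\Lambda| / (2\pi)^r \}^{1/2} \int \exp (w_1 Q_1 + w_2 Q_2) \exp (-\tfrac{1}{2} Q) \, dy \Big], \tag{C.2.2}$$

for some real $w_1$, $w_2$, where $Q = (y - \mu)' \Lambda (y - \mu)$.

Let $w_i \Lambda_i = V_i$. Then

$$\begin{aligned}
w_1 Q_1 + w_2 Q_2 - \tfrac{1}{2} Q &= (y - \mu_1)' V_1 (y - \mu_1) + (y - \mu_2)' V_2 (y - \mu_2) - \tfrac{1}{2} (y - \mu)' \Lambda (y - \mu) \\
&= -\tfrac{1}{2} z' \Sigma z + 2 (d_1' V_1 + d_2' V_2) z + d_1' V_1 d_1 + d_2' V_2 d_2,
\end{aligned} \tag{C.2.3}$$

where $z = y - \mu$, $\Sigma = \Lambda - 2V_1 - 2V_2$, and $d_i = \mu - \mu_i$, $i = 1, 2$. This modified expression in (C.2.3) when used in (C.2.2) gives

$$k(w_1, w_2) = -\tfrac{1}{2} \log |\Lambda^{-1} \Sigma| + 2 (d_1' V_1 + d_2' V_2) \Sigma^{-1} (d_1' V_1 + d_2' V_2)' + d_1' V_1 d_1 + d_2' V_2 d_2. \tag{C.2.4}$$

If we write $d_i' V_i d_i = d_i' (\Lambda - 2V_1 - 2V_2) \Sigma^{-1} V_i d_i$, $i = 1, 2$, a few terms cancel with each other on the right hand side of (C.2.4), leaving behind

$$k(w_1, w_2) = -\tfrac{1}{2} \log |Z| + d_1' \Lambda Z^{-1} U_1 d_1 + d_2' \Lambda Z^{-1} U_2 d_2 - 2 \{ (d_1 - d_2)' \Lambda U_1 Z^{-1} U_2 (d_1 - d_2) \}, \tag{C.2.5}$$

where $Z = \Lambda^{-1} \Sigma = I - 2U_1 - 2U_2$ and $U_i = \Lambda^{-1} V_i$, $i = 1, 2$.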


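We now express all the components of (C.2.5) in the form of power series by means of Lemmas B.1, B.2, and B.3. Firstly, using the result (B.1.1) of Lemma B.1, we can write

$$-\tfrac{1}{2} \log |Z| = \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} \, \mathrm{tr}[(U_1 + U_2)^a].$$

Since $\Lambda = \Lambda_1 + \Lambda_2$ and $\Lambda_1 \Lambda^{-1} \Lambda_2$ is symmetric, we have $U_1 U_2 = U_2 U_1$. We can, therefore, further expand $(U_1 + U_2)^a$ by the result (B.2.1) of Lemma B.2 and get

$$\begin{aligned}
-\tfrac{1}{2} \log |Z| &= \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} \sum_{b=0}^{a} \binom{a}{b} \mathrm{tr}(U_1^b U_2^{a-b}) \\
&= \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} \mathrm{tr}(U_1^a) + \sum_{b=1}^{\infty} \frac{2^{b-1}}{b} \mathrm{tr}(U_2^b) + \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-1)!}{a! \, b!} \mathrm{tr}(U_1^a U_2^b).
\end{aligned} \tag{C.2.6}$$

Further by Lemma B.3, $Z^{-1}$ can be expressed as

$$Z^{-1} = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} 2^{i+j} \binom{i+j}{i} U_1^i U_2^j, \tag{C.2.7}$$

so that

$$\begin{aligned}
d_1' \Lambda Z^{-1} U_1 d_1 &= d_1' \Lambda \Big( \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} 2^{i+j} \binom{i+j}{i} U_1^{i+1} U_2^j \Big) d_1 \\
&= \sum_{a=1}^{\infty} 2^{a-1} d_1' \Lambda U_1^a d_1 + \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-1)!}{(a-1)! \, b!} d_1' \Lambda U_1^a U_2^b d_1.
\end{aligned} \tag{C.2.8}$$

Similarly,

$$d_2' \Lambda Z^{-1} U_2 d_2 = \sum_{b=1}^{\infty} 2^{b-1} d_2' \Lambda U_2^b d_2 + \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-1)!}{a! \, (b-1)!} d_2' \Lambda U_1^a U_2^b d_2. \tag{C.2.9}$$

Utilizing the same result, i.e., (C.2.7), we also have

$$(d_1 - d_2)' \Lambda U_1 Z^{-1} U_2 (d_1 - d_2) = \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-2} \frac{(a+b-2)!}{(a-1)! \, (b-1)!} (d_1 - d_2)' \Lambda U_1^a U_2^b (d_1 - d_2). \tag{C.2.10}$$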

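Substitution of (C.2.6) and (C.2.8) through (C.2.10) in (C.2.5) and rearrangement of terms further yields

$$\begin{aligned}
k(w_1, w_2) ={}& \sum_{a=1}^{\infty} \frac{2^{a-1}}{a} \{ \mathrm{tr}(U_1^a) + a \, d_1' \Lambda U_1^a d_1 \} + \sum_{b=1}^{\infty} \frac{2^{b-1}}{b} \{ \mathrm{tr}(U_2^b) + b \, d_2' \Lambda U_2^b d_2 \} \\
&+ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} \frac{(a+b-2)!}{a! \, b!} \{ (a+b-1) \, \mathrm{tr}(U_1^a U_2^b) + (a d_1 + b d_2)' \Lambda U_1^a U_2^b (a d_1 + b d_2) \\
&\qquad - a \, d_1' \Lambda U_1^a U_2^b d_1 - b \, d_2' \Lambda U_1^a U_2^b d_2 \}.
\end{aligned} \tag{C.2.11}$$

Finally, on substituting back $U_i = \Lambda^{-1} V_i = w_i \Lambda^{-1} \Lambda_i$, $i = 1, 2$, we obtain the result of (C.2.1), i.e.,

$$\begin{aligned}
k(w_1, w_2) ={}& \sum_{a=1}^{\infty} 2^{a-1} (a-1)! \, \{ \mathrm{tr}(Z_1^a) + a \, d_1' \Lambda Z_1^a d_1 \} \frac{w_1^a}{a!} \\
&+ \sum_{b=1}^{\infty} 2^{b-1} (b-1)! \, \{ \mathrm{tr}(Z_2^b) + b \, d_2' \Lambda Z_2^b d_2 \} \frac{w_2^b}{b!} \\
&+ \sum_{a=1}^{\infty} \sum_{b=1}^{\infty} 2^{a+b-1} (a+b-2)! \, \{ (a+b-1) \, \mathrm{tr}(Z_1^a Z_2^b) + (a d_1 + b d_2)' \Lambda Z_1^a Z_2^b (a d_1 + b d_2) \\
&\qquad - a \, d_1' \Lambda Z_1^a Z_2^b d_1 - b \, d_2' \Lambda Z_1^a Z_2^b d_2 \} \frac{w_1^a w_2^b}{a! \, b!},
\end{aligned}$$

where $d_i = (\mu - \mu_i)$ and $Z_i = \Lambda^{-1} \Lambda_i$, $i = 1, 2$.

Remark: The cumulants $k_{a0}$, $k_{0b}$, and $k_{ab}$ can be obtained from the function $k(w_1, w_2)$ in (C.2.1) by separating out the coefficients of $(w_1^a / a!)$, $(w_2^b / b!)$, and $(w_1^a w_2^b / a! \, b!)$, respectively. For example, the joint cumulant $k_{ab}$ is given by

$$\begin{aligned}
k_{ab} = 2^{a+b-1} (a+b-2)! \, \{ & (a+b-1) \, \mathrm{tr}(Z_1^a Z_2^b) + (a d_1 + b d_2)' \Lambda Z_1^a Z_2^b (a d_1 + b d_2) \\
& - a \, d_1' \Lambda Z_1^a Z_2^b d_1 - b \, d_2' \Lambda Z_1^a Z_2^b d_2 \}.
\end{aligned} \tag{C.2.12}$$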


APPENDIX D

SOME USEFUL RESULTS AND FORMULAE

D.1 Gamma Function:

$$\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1} \exp(-y) \, dy, \quad \alpha > 0.$$

D.2 Beta Function in terms of Gamma Function:

$$B(\alpha, \beta) = \frac{\Gamma(\alpha) \, \Gamma(\beta)}{\Gamma(\alpha + \beta)}.$$

D.3 Inverse Gamma Function:

$$\Gamma(\alpha) = b^{\alpha} \int_0^{\infty} y^{-(\alpha + 1)} \exp(-b/y) \, dy, \quad \alpha > 0, \ b > 0. \tag{D.3.1}$$

D.4 An r-Variate t-distribution [Giri (1977)]:

The $r$-random vector $Y$ is said to have an $r$-variate t-distribution, to be designated as $t_r(a, \Sigma, \nu)$, if its p.d.f. is given by

$$f_Y(y) = \frac{\Gamma\big(\frac{\nu + r}{2}\big)}{\{\Gamma(\frac{1}{2})\}^r \, \Gamma(\frac{\nu}{2}) \, \nu^{r/2}} \, |\Sigma|^{-1/2} \Big[ 1 + \frac{1}{\nu} (y - a)' \Sigma^{-1} (y - a) \Big]^{-(\nu + r)/2}, \tag{D.4.1}$$

where $\Sigma$ is an $r \times r$ symmetric positive definite matrix.


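D.5 Inversion Formulae [Cook (1951)]:

The following formulae are applicable to cases $a + b \le 6$ and give $\mu_{ab}$ in terms of $k_{ab}$ for $a \ge b$ only; interchange of suffixes will give $\mu_{ba}$.

$$\mu_{10} = k_{10},$$

$$\mu_{20} = k_{20} + k_{10}^2,$$

$$\mu_{11} = k_{11} + k_{10} k_{01},$$

$$\mu_{30} = k_{30} + 3 k_{20} k_{10} + k_{10}^3,$$

$$\mu_{21} = k_{21} + k_{20} k_{01} + 2 k_{11} k_{10} + k_{10}^2 k_{01},$$

$$\mu_{40} = k_{40} + 4 k_{30} k_{10} + 3 k_{20}^2 + 6 k_{20} k_{10}^2 + k_{10}^4,$$

$$\mu_{31} = k_{31} + k_{30} k_{01} + 3 k_{21} k_{10} + 3 k_{20} k_{11} + 3 k_{20} k_{10} k_{01} + 3 k_{11} k_{10}^2 + k_{10}^3 k_{01},$$

$$\mu_{22} = k_{22} + 2 k_{21} k_{01} + k_{20} k_{01}^2 + k_{20} k_{02} + 2 k_{12} k_{10} + 2 k_{11}^2 + 4 k_{11} k_{10} k_{01} + k_{10}^2 k_{01}^2 + k_{10}^2 k_{02},$$

$$\mu_{50} = k_{50} + 5 k_{40} k_{10} + 10 k_{30} k_{20} + 10 k_{30} k_{10}^2 + 15 k_{20}^2 k_{10} + 10 k_{20} k_{10}^3 + k_{10}^5,$$

$$\begin{aligned}
\mu_{41} ={}& k_{41} + k_{40} k_{01} + 4 k_{31} k_{10} + 4 k_{30} k_{11} + 4 k_{30} k_{10} k_{01} + 6 k_{21} k_{20} + 6 k_{21} k_{10}^2 \\
&+ 3 k_{20}^2 k_{01} + 12 k_{20} k_{11} k_{10} + 6 k_{20} k_{10}^2 k_{01} + 4 k_{11} k_{10}^3 + k_{10}^4 k_{01},
\end{aligned}$$

$$\begin{aligned}
\mu_{32} ={}& k_{32} + 2 k_{31} k_{01} + k_{30} k_{01}^2 + k_{30} k_{02} + 3 k_{22} k_{10} + 6 k_{21} k_{11} + 6 k_{21} k_{10} k_{01} \\
&+ 3 k_{20} k_{12} + 6 k_{20} k_{11} k_{01} + 3 k_{20} k_{10} k_{01}^2 + 3 k_{20} k_{10} k_{02} + 3 k_{12} k_{10}^2 \\
&+ 6 k_{11}^2 k_{10} + 6 k_{11} k_{10}^2 k_{01} + k_{10}^3 k_{01}^2 + k_{10}^3 k_{02},
\end{aligned}$$

$$\begin{aligned}
\mu_{60} ={}& k_{60} + 6 k_{50} k_{10} + 15 k_{40} k_{20} + 15 k_{40} k_{10}^2 + 10 k_{30}^2 + 60 k_{30} k_{20} k_{10} \\
&+ 20 k_{30} k_{10}^3 + 15 k_{20}^3 + 45 k_{20}^2 k_{10}^2 + 15 k_{20} k_{10}^4 + k_{10}^6,
\end{aligned}$$

$$\begin{aligned}
\mu_{51} ={}& k_{51} + k_{50} k_{01} + 5 k_{41} k_{10} + 5 k_{40} k_{11} + 5 k_{40} k_{10} k_{01} + 10 k_{31} k_{20} + 10 k_{31} k_{10}^2 \\
&+ 10 k_{30} k_{21} + 10 k_{30} k_{20} k_{01} + 20 k_{30} k_{11} k_{10} + 10 k_{30} k_{10}^2 k_{01} + 30 k_{21} k_{20} k_{10} \\
&+ 10 k_{21} k_{10}^3 + 15 k_{20}^2 k_{11} + 15 k_{20}^2 k_{10} k_{01} + 30 k_{20} k_{11} k_{10}^2 + 10 k_{20} k_{10}^3 k_{01} + 5 k_{11} k_{10}^4 + k_{10}^5 k_{01},
\end{aligned}$$

$$\begin{aligned}
\mu_{42} ={}& k_{42} + 2 k_{41} k_{01} + k_{40} k_{01}^2 + k_{40} k_{02} + 4 k_{32} k_{10} + 8 k_{31} k_{11} + 8 k_{31} k_{10} k_{01} \\
&+ 4 k_{30} k_{12} + 8 k_{30} k_{11} k_{01} + 4 k_{30} k_{10} k_{01}^2 + 4 k_{30} k_{10} k_{02} + 6 k_{22} k_{20} \\
&+ 6 k_{22} k_{10}^2 + 6 k_{21}^2 + 12 k_{21} k_{20} k_{01} + 24 k_{21} k_{11} k_{10} + 12 k_{21} k_{10}^2 k_{01} \\
&+ 3 k_{20}^2 k_{01}^2 + 3 k_{20}^2 k_{02} + 12 k_{20} k_{12} k_{10} + 12 k_{20} k_{11}^2 + 24 k_{20} k_{11} k_{10} k_{01} \\
&+ 6 k_{20} k_{10}^2 k_{01}^2 + 6 k_{20} k_{10}^2 k_{02} + 4 k_{12} k_{10}^3 + 12 k_{11}^2 k_{10}^2 + 8 k_{11} k_{10}^3 k_{01} + k_{10}^4 k_{01}^2 + k_{10}^4 k_{02},
\end{aligned}$$

$$\begin{aligned}
\mu_{33} ={}& k_{33} + 3 k_{32} k_{01} + 3 k_{31} k_{01}^2 + 3 k_{31} k_{02} + k_{30} k_{01}^3 + 3 k_{30} k_{01} k_{02} + k_{30} k_{03} \\
&+ 3 k_{23} k_{10} + 9 k_{22} k_{11} + 9 k_{22} k_{10} k_{01} + 9 k_{21} k_{12} + 18 k_{21} k_{11} k_{01} + 9 k_{21} k_{10} k_{01}^2 \\
&+ 9 k_{21} k_{10} k_{02} + 3 k_{20} k_{13} + 9 k_{20} k_{12} k_{01} + 9 k_{20} k_{11} k_{01}^2 + 9 k_{20} k_{11} k_{02} \\
&+ 3 k_{20} k_{10} k_{01}^3 + 9 k_{20} k_{10} k_{01} k_{02} + 3 k_{20} k_{10} k_{03} + 3 k_{13} k_{10}^2 + 18 k_{12} k_{11} k_{10} \\
&+ 9 k_{12} k_{10}^2 k_{01} + 6 k_{11}^3 + 18 k_{11}^2 k_{10} k_{01} + 9 k_{11} k_{10}^2 k_{01}^2 + 9 k_{11} k_{10}^2 k_{02} \\
&+ k_{10}^3 k_{01}^3 + 3 k_{10}^3 k_{01} k_{02} + k_{10}^3 k_{03}.
\end{aligned}$$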
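Formulae of this kind can also be generated mechanically. The sketch below uses a standard recursion obtained by differentiating the moment generating function $M = \exp(K)$; the recursion, the function name, and the integer cumulant values are assumptions introduced here for illustration and are not part of Cook's tables. It is compared against a few of the listed expressions, read as moments about the origin.

```python
from math import comb

def raw_moments(k, order):
    """Bivariate moments mu[(a, b)], a + b <= order, from cumulants k[(a, b)].

    Uses mu_{a,b} = sum_{i<a} sum_{j<=b} C(a-1, i) C(b, j) k_{a-i,b-j} mu_{i,j}
    for a >= 1, and the analogous one-index recursion when a = 0.
    Missing cumulants are treated as zero.
    """
    mu = {(0, 0): 1}
    for total in range(1, order + 1):
        for a in range(total + 1):
            b = total - a
            if a >= 1:
                mu[(a, b)] = sum(
                    comb(a - 1, i) * comb(b, j) * k.get((a - i, b - j), 0) * mu[(i, j)]
                    for i in range(a) for j in range(b + 1))
            else:
                mu[(0, b)] = sum(
                    comb(b - 1, j) * k.get((0, b - j), 0) * mu[(0, j)]
                    for j in range(b))
    return mu

# Hypothetical integer cumulants, chosen only to make the comparison exact.
k = {(1, 0): 2, (0, 1): 3, (2, 0): 5, (1, 1): 7, (0, 2): 11,
     (2, 1): 13, (1, 2): 17, (3, 0): 19, (0, 3): 23, (2, 2): 29, (3, 1): 31}
mu = raw_moments(k, 4)

# mu_21 as listed in D.5
mu21_listed = (k[(2, 1)] + k[(2, 0)] * k[(0, 1)] + 2 * k[(1, 1)] * k[(1, 0)]
               + k[(1, 0)] ** 2 * k[(0, 1)])
print(mu[(2, 1)] == mu21_listed)
```

A recursion of this form reproduces the tabulated expansions term by term, which gives a convenient cross-check on the higher-order entries.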



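D.6 An Alternative Expression for $[\{(a + r)/4\} \log (1 + x a^{-1})]$

Using expansion of $\log_e (1 + x a^{-1})$ in terms of $(x a^{-1})$ we have

$$\begin{aligned}
\Big(\frac{a + r}{4}\Big) \log_e (1 + x a^{-1}) ={}& \Big(\frac{a + r}{4}\Big) \Big[ x a^{-1} - \frac{x^2}{2} a^{-2} + \frac{x^3}{3} a^{-3} - \frac{x^4}{4} a^{-4} + \frac{x^5}{5} a^{-5} - \frac{x^6}{6} a^{-6} + \frac{x^7}{7} a^{-7} - \cdots \Big] \\
={}& \Big[ \frac{x}{4} - \frac{x^2}{8} a^{-1} + \frac{x^3}{12} a^{-2} - \frac{x^4}{16} a^{-3} + \frac{x^5}{20} a^{-4} - \frac{x^6}{24} a^{-5} + \frac{x^7}{28} a^{-6} - \cdots \Big] \\
&+ \Big[ \frac{r x}{4} a^{-1} - \frac{r x^2}{8} a^{-2} + \frac{r x^3}{12} a^{-3} - \frac{r x^4}{16} a^{-4} + \frac{r x^5}{20} a^{-5} - \frac{r x^6}{24} a^{-6} + \cdots \Big] \\
={}& \frac{x}{4} + \frac{1}{8} (2 r x - x^2) a^{-1} + \frac{1}{24} (2 x^3 - 3 r x^2) a^{-2} + \frac{1}{48} (4 r x^3 - 3 x^4) a^{-3} \\
&+ \frac{1}{80} (4 x^5 - 5 r x^4) a^{-4} + \frac{1}{120} (6 r x^5 - 5 x^6) a^{-5} - \frac{1}{168} (7 r x^6 - 6 x^7) a^{-6} + \cdots,
\end{aligned}$$

i.e.,

$$\Big(\frac{a + r}{4}\Big) \log_e (1 + x a^{-1}) = \frac{x}{4} + \sum_{i=1}^{\infty} c_i (a^{-1})^i.$$

So far as the coefficients $c_i$, $i = 1, 2, \ldots, \infty$, on the right hand side of the above equation are concerned, the following algorithm can be seen to be effective in specifying any coefficient:

$$c_i = \frac{(-1)^{i+1}}{4 i (i + 1)} \, x^i \, [(i + 1) r - i x], \quad i = 1, 2, \ldots$$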
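A short numerical check of this expansion is sketched below. It rests on an assumption that should be read carefully: the closed form used for the coefficients, $c_i = (-1)^{i+1} x^i [(i+1) r - i x] / \{4 i (i+1)\}$, is re-derived here by regrouping the logarithmic series in powers of $a^{-1}$, and the multiplier $(a + r)/4$ is taken from the section heading. The values of $x$, $a$, and $r$ are hypothetical.

```python
import math

def c(i, x, r):
    """Coefficient of a^(-i); re-derived closed form (an assumption, see above)."""
    return (-1) ** (i + 1) * x ** i * ((i + 1) * r - i * x) / (4 * i * (i + 1))

def series(x, a, r, terms=40):
    """x/4 + sum_{i=1}^{terms} c_i a^(-i)."""
    return x / 4 + sum(c(i, x, r) / a ** i for i in range(1, terms + 1))

x, a, r = 0.5, 10.0, 3           # hypothetical values with |x / a| < 1
direct = (a + r) / 4 * math.log(1 + x / a)
print(abs(series(x, a, r) - direct) < 1e-12)
```

The first two coefficients from this closed form are $(2rx - x^2)/8$ and $(2x^3 - 3rx^2)/24$, in line with the leading terms of the expansion.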



REFERENCES

Anderson, T.W. (1957). "An Introduction to Multivariate Statistical Analysis." Wiley, New York.

Andrews, D.F. (1971). Sequentially designed experiments for screening out bad models with F tests. Biometrika, 58, 427-432.

Atkinson, A.C. (1978). Posterior probabilities for choosing a regression model. Biometrika, 65, 39-48.

Atkinson, A.C. and Cox, D.R. (1974). Planning experiments for discriminating between models. J. Roy. Statist. Soc. B, 36, 321-348.

Atkinson, A.C. and Fedorov, V.V. (1975a). The design of experiments for discriminating between two rival models. Biometrika, 62, 57-70.

Atkinson, A.C. and Fedorov, V.V. (1975b). Optimal design: experiments for discriminating between several models. Biometrika, 62, 289-303.

Bellman, R. (1974). "Introduction to Matrix Analysis." Tata McGraw Hill Co., New Delhi.

Box, G.E.P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317-346.

Box, G.E.P. and Henson, T.L. (1970). Some aspects of mathematical modelling in chemical engineering. Proc. Inaug. Conf. Scient. Comp. Centre, Cairo Univ. Press, Cairo.

Box, G.E.P. and Hill, W.J. (1967). Discriminating among mechanistic models. Technometrics, 9, 57-71.

Box, M.J. (1966). A comparison of several current optimization methods, and the use of transformations in constrained problems. Comput. J., 9, 67-77.

Buzzi-Ferraris, G. and Forzatti, P. (1983). A new sequential experimental design procedure for discriminating among rival models. Chem. Eng. Sc., 38, 225-232.

Buzzi-Ferraris, G., Forzatti, P., Emig, G., and Hofman, H. (1984). Sequential experimental design for model discrimination in case of multiple responses. Chem. Eng. Sc., 39, 81-85.

Cook, M.B. (1951). Bi-variate k-statistics and cumulants of their joint sampling distribution. Biometrika, 38, 179-195.

Dumez, F. and Froment, G.F. (1976). Dehydrogenation of 1-Butene into Butadiene. Kinetics, catalyst coking, and reactor design. Ind. Eng. Chem. Proc. Design and Development, 15, 291-301.

Fedorov, V.V. (1972). "Theory of Optimal Experiments." Academic Press, New York.

Fedorov, V.V. and Pazman, A. (1968). Design of physical experiments. Fortschritte der Physik, 16, 325-355.

Froment, G.F. (1975). Model discrimination and parameter estimation in heterogeneous catalysis. AIChE J., 21, 1041-1057.

Froment, G. and Mezaki, R. (1970). Sequential discrimination and estimation procedures for rate modelling in heterogeneous catalysis. Chem. Eng. Sc., 25, 293-300.

Giri, N.C. (1977). "Multivariate Statistical Inference." Academic Press, New York.

Hellinger, E. (1909). Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen. J. für reine und angew. Mathematik, 136, 210.

Hill, P.D.H. (1976). Optimal experimental designs for model discrimination. Ph.D. Thesis, Univ. of Glasgow, U.K.

Hill, P.D.H. (1978). A review of experimental design procedures for regression model discrimination. Technometrics, 20, 15-21.

Hill, W.J. and Hunter, W.G. (1967). Design of experiments for model discrimination in multiresponse situations. Tech. Rep. No. 65, Deptt. of Stat., Univ. of Wisconsin, Madison.

Hill, W.J. and Hunter, W.G. (1969). A note on designs for model discrimination: Variance unknown case. Technometrics, 11, 396-400.

Hill, W.J., Hunter, W.G., and Wichern, D.W. (1968). A joint design criterion for the dual problem of model discrimination and parameter estimation. Technometrics, 10, 145-160.

Hosten, L.H. and Froment, G.F. (1976). Non Bayesian sequential experimental design procedures for optimal discrimination between rival models. Proc. 4th Int. Symp. on Chem. React. Eng., Heidelberg, 11-113.

Hsiang, T. and Reilly, P.M. (1971). A practical method for discriminating among mechanistic models. Can. J. Chem. Eng., 49, 865-871.

Hunter, W.G. and Mezaki, R. (1967). An experimental design strategy for distinguishing among rival mechanistic models - an application. Can. J. Chem. Eng., 45, 247-249.

Hunter, W.G. and Reiner, A.M. (1965). Designs for discriminating between two rival models. Technometrics, 7, 307-323.

Hunter, W.G. and Wichern, D.W. (1966). Tech. Rep. No. 33, Deptt. Chem. Eng. and Stat., Univ. of Wisconsin, Madison.

Iyengar, S.S. and Rao, M.S. (1983). Statistical techniques in modelling of complex systems: Single and multiresponse models. IEEE Trans. on Sys., Man and Cybernetics, SMC-13, 175-189.

Jeffreys, H. (1961). "Theory of Probability." Clarendon Press, Oxford.

Meter, D.A., Pirie, W., and Blot, W. (1970). A comparison of two model discrimination criteria. Technometrics, 12, 457-470.

Pazman, A. and Fedorov, V.V. (1968). Planning of regression and discrimination experiments on NN scattering. Soviet J. of Nuc. Phy., 6, 619-621.

Prasad, K.B.S. and Rao, M.S. (1977). Use of expected likelihood in sequential model discrimination in multiresponse systems. Chem. Eng. Sc., 32, 1411-1418.

Rao, M.S. and Iyengar, S.S. (1984). Application of statistical techniques in modelling of complex systems. "Computer Modelling of Complex Biological Systems", CRC Press, Inc., Boca Raton, Florida (Ed. Iyengar, S.S.).

Reilly, P.M. (1970). Statistical methods in model discrimination. Can. J. Chem. Eng., 48, 168-173.

Reilly, P.M. and Blau, G.E. (1974). The use of statistical methods to build mathematical models for chemical reacting systems. Can. J. Chem. Eng., 52, 289-299.

Roth, P.M. (1965). Design of experiments for discriminating among rival models. Ph.D. Thesis, Princeton Univ., USA.

Shannon, C.E. (1948). "A mathematical theory of communication." Bell System Tech. J., 27, 379-423.

Siddik, S.M. (1972). Kullback-Leibler information function and the sequential selection of experiments to discriminate among several linear models. Ph.D. Thesis, Univ. of Pennsylvania, USA.

Singh, S. and Rao, M.S. (1981). Parameter estimation and model discrimination in multiresponse models. Indian Chem. Engineer, XXIII, 19-26.

Wentzheimer, W.W. (1970). Modelling of heterogeneous catalysed reactions using statistical experimental designs and data analysis. Ph.D. Thesis, Univ. of Pennsylvania, USA.




